Hi and welcome! I’ve been playing Pokemon since I was about 6 years old. The series has come a long way from the original 151, expanding to nearly 900 Pokemon! While some critics might say the quality of Pokemon designs has declined (Vanilluxe), it’s hard not to be impressed by the longevity of the franchise. I wanted to take some time to combine my childhood hobby with my current one.
First, let us take a look at the data we’re working with!
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
rm(list = ls())
library(tidyverse)
library(ggrepel)
library(png)
library(grid)
setwd("~/Desktop/code/R/Pokemon")
pokemon <- read_csv("Pokemon.csv")
glimpse(pokemon)
## Rows: 800
## Columns: 13
## $ `#` <dbl> 1, 2, 3, 3, 4, 5, 6, 6, 6, 7, 8, 9, 9, 10, 11, 12, 13, 14,…
## $ Name <chr> "Bulbasaur", "Ivysaur", "Venusaur", "VenusaurMega Venusaur…
## $ `Type 1` <chr> "Grass", "Grass", "Grass", "Grass", "Fire", "Fire", "Fire"…
## $ `Type 2` <chr> "Poison", "Poison", "Poison", "Poison", NA, NA, "Flying", …
## $ Total <dbl> 318, 405, 525, 625, 309, 405, 534, 634, 634, 314, 405, 530…
## $ HP <dbl> 45, 60, 80, 80, 39, 58, 78, 78, 78, 44, 59, 79, 79, 45, 50…
## $ Attack <dbl> 49, 62, 82, 100, 52, 64, 84, 130, 104, 48, 63, 83, 103, 30…
## $ Defense <dbl> 49, 63, 83, 123, 43, 58, 78, 111, 78, 65, 80, 100, 120, 35…
## $ `Sp. Atk` <dbl> 65, 80, 100, 122, 60, 80, 109, 130, 159, 50, 65, 85, 135, …
## $ `Sp. Def` <dbl> 65, 80, 100, 120, 50, 65, 85, 85, 115, 64, 80, 105, 115, 2…
## $ Speed <dbl> 45, 60, 80, 80, 65, 80, 100, 100, 100, 43, 58, 78, 78, 45,…
## $ Generation <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ Legendary <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FA…
There are 800 rows of data encompassing the first six generations of Pokemon. The main series games are on the eighth generation, so this data is a little dated. If you compare the number of rows with the “National Dex” numbering, you might be thinking, “There are only 721 Pokemon, so how did we get 800 rows of data?” This is because variant forms (such as “Mega” Evolutions) receive their own row. We have a column for most of the basic information we could want: Name, Type (Primary and Secondary), Statistics, and so on.
The data has plenty of information, but we can derive a few more columns from the ones we already have.
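To see where the extra rows come from, here is a minimal sketch on a hand-typed stand-in for the first few rows (the `toy` tibble and the Charizard form names are my own assumptions, not read from the real file). It compares the row count with the number of distinct Pokedex numbers.

```r
library(dplyr)

# Toy stand-in for the first nine rows of the data (illustrative only)
toy <- tibble(
  `#`  = c(1, 2, 3, 3, 4, 5, 6, 6, 6),
  Name = c("Bulbasaur", "Ivysaur", "Venusaur", "VenusaurMega Venusaur",
           "Charmander", "Charmeleon", "Charizard",
           "CharizardMega Charizard X", "CharizardMega Charizard Y")
)

# Rows versus distinct Pokedex numbers
form_summary <- toy %>%
  summarise(rows = n(), distinct_pokemon = n_distinct(`#`))

# Numbers that appear more than once are the ones with variant forms
variant_numbers <- toy %>%
  count(`#`, name = "forms") %>%
  filter(forms > 1)
```

Running the same two summaries on the full `pokemon` tibble should show the gap between rows and distinct Pokemon directly.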
pokemon <- pokemon %>%
rename(SpAtk = `Sp. Atk`, SpDef = `Sp. Def`, Type1 = `Type 1`, Type2 = `Type 2`, PokedexNum = `#`)
pokemon <- pokemon %>%
mutate(AtkTotal = Attack + SpAtk,
DefTotal = Defense + SpDef,
isMega = grepl("Mega ", Name, fixed = TRUE), # trailing space so "Meganium" is not flagged as a Mega form
isMultiType = !is.na(Type2),
classification = if_else(isMega == TRUE, "Mega",
if_else(Legendary == TRUE, "Legendary", "Normal"))
)
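The classification logic can also be written with `case_when()`, which some people find easier to read than nested `if_else()` calls. This is only a sketch on a made-up `toy` tibble; matching on `"Mega "` with a trailing space is my assumption about how form names are written, chosen so that a name like Meganium is not picked up.

```r
library(dplyr)

# Toy rows (illustrative only): one plain Pokemon, one Mega, one name that
# merely contains "Mega", one Legendary, and one Mega of a Legendary
toy <- tibble(
  Name      = c("Venusaur", "VenusaurMega Venusaur", "Meganium",
                "Mewtwo", "MewtwoMega Mewtwo X"),
  Legendary = c(FALSE, FALSE, FALSE, TRUE, TRUE)
)

toy <- toy %>%
  mutate(
    isMega = grepl("Mega ", Name, fixed = TRUE),
    classification = case_when(
      isMega    ~ "Mega",       # checked first, so Mega wins over Legendary
      Legendary ~ "Legendary",
      TRUE      ~ "Normal"
    )
  )
```

Because the conditions are evaluated in order, a Mega form of a Legendary Pokemon ends up labelled "Mega", which matches the nested `if_else()` version.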
Some of the columns weren’t read in with the type we want (e.g. character instead of factor), so we can change them manually. I’ll change my “dbl” columns to “int” (not a huge deal) and the Generation and classification columns to factors (important for modelling).
factor_cols = c("Generation", "classification")
int_cols = c("PokedexNum", "Total", "HP", "Attack", "Defense", "SpAtk", "SpDef", "Speed")
pokemon[factor_cols] <- lapply(pokemon[factor_cols], factor)
pokemon[int_cols] <- lapply(pokemon[int_cols], as.integer)
glimpse(pokemon)
## Rows: 800
## Columns: 18
## $ PokedexNum <int> 1, 2, 3, 3, 4, 5, 6, 6, 6, 7, 8, 9, 9, 10, 11, 12, 13,…
## $ Name <chr> "Bulbasaur", "Ivysaur", "Venusaur", "VenusaurMega Venu…
## $ Type1 <chr> "Grass", "Grass", "Grass", "Grass", "Fire", "Fire", "F…
## $ Type2 <chr> "Poison", "Poison", "Poison", "Poison", NA, NA, "Flyin…
## $ Total <int> 318, 405, 525, 625, 309, 405, 534, 634, 634, 314, 405,…
## $ HP <int> 45, 60, 80, 80, 39, 58, 78, 78, 78, 44, 59, 79, 79, 45…
## $ Attack <int> 49, 62, 82, 100, 52, 64, 84, 130, 104, 48, 63, 83, 103…
## $ Defense <int> 49, 63, 83, 123, 43, 58, 78, 111, 78, 65, 80, 100, 120…
## $ SpAtk <int> 65, 80, 100, 122, 60, 80, 109, 130, 159, 50, 65, 85, 1…
## $ SpDef <int> 65, 80, 100, 120, 50, 65, 85, 85, 115, 64, 80, 105, 11…
## $ Speed <int> 45, 60, 80, 80, 65, 80, 100, 100, 100, 43, 58, 78, 78,…
## $ Generation <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ Legendary <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
## $ AtkTotal <dbl> 114, 142, 182, 222, 112, 144, 193, 260, 263, 98, 128, …
## $ DefTotal <dbl> 114, 143, 183, 243, 93, 123, 163, 196, 193, 129, 160, …
## $ isMega <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, …
## $ isMultiType <lgl> TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE…
## $ classification <fct> Normal, Normal, Normal, Mega, Normal, Normal, Normal, …
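For anyone who prefers to stay inside a pipe, the same conversion can be sketched with `across()`. The `toy` tibble below is made up for illustration; the column-name vectors mirror the ones used above.

```r
library(dplyr)

# Toy stand-in with one column of each kind (illustrative only)
toy <- tibble(
  Generation     = c(1, 1, 2),
  classification = c("Normal", "Mega", "Legendary"),
  HP             = c(45, 80, 106)
)

factor_cols <- c("Generation", "classification")
int_cols    <- c("HP")

toy <- toy %>%
  mutate(
    across(all_of(factor_cols), as.factor),  # character/double -> factor
    across(all_of(int_cols), as.integer)     # double -> integer
  )
```

One thing worth knowing: factor levels are sorted alphabetically by default (Legendary, Mega, Normal), and that is the order in which an unnamed vector of colors gets assigned in `scale_fill_manual()`.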
Now let’s start making some fun graphs!
totals <- pokemon %>%
group_by(Type1) %>%
summarise(count = n())
# Generation 1 Color Scheme
pokemon %>%
ggplot(aes(x = fct_infreq(Type1))) +
geom_bar(fill = "#84ADD7", color = "#F2684A") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
labs(title = "Frequency of Primary Types", x = "Type", y = "Frequency") +
geom_text(aes(Type1, count + 5, label = count, fill = NULL), data = totals)
This is my first graph, so I thought it was appropriate to give it a Gen 1 color scheme. That means Blue and Red, as those were the first two games released in North America. Note: In Japan, it was Green and Red, which is why the remakes of Gen 1 were FireRed and LeafGreen.
Water appears to be by far the most common primary type, with Flying being the least common at a staggering 4! A fun fact about Flying types is that until Generation 5 there were no pure/primary Flying types. The number is really 3, because one of them (Tornadus) has two forms. On that note, our data going forward is a little skewed because Pokemon with multiple forms are counted more than once, even when a form brings no stat changes. It’s not a huge deal, but it’s something to take note of. The second least common type is Fairy, but the Fairy type was only recently introduced (in Generation 6)! I think it’s fascinating how so many types hover right around the 27-32 range. I’d like to think Water being the most common is a nod to how the Earth is mostly water, but that might be a little [Farfetch’d](https://bulbapedia.bulbagarden.net/wiki/Farfetch%27d_%28Pok%C3%A9mon%29).
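To check how much variant forms inflate a count, one option is to compare rows with distinct Pokedex numbers per type. A minimal sketch on a hand-typed `toy` tibble (the form names are my assumption about how the file spells them):

```r
library(dplyr)

# Toy stand-in for the four primary-Flying rows plus one other type
toy <- tibble(
  PokedexNum = c(641, 641, 714, 715, 7),
  Name  = c("TornadusIncarnate Forme", "TornadusTherian Forme",
            "Noibat", "Noivern", "Squirtle"),
  Type1 = c("Flying", "Flying", "Flying", "Flying", "Water")
)

# Rows count every form; n_distinct() counts each Pokedex number once
type_counts <- toy %>%
  group_by(Type1) %>%
  summarise(rows = n(), distinct_pokemon = n_distinct(PokedexNum)) %>%
  arrange(Type1)
```

Here Flying has 4 rows but only 3 distinct Pokemon, which is the Tornadus effect described above.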
totals <- pokemon %>%
filter(!is.na(Type2)) %>%
group_by(Type2) %>%
summarise(count = n())
# Generation 2 Color Scheme
pokemon %>%
filter(!is.na(Type2)) %>%
ggplot(aes(x = fct_infreq(Type2))) +
geom_bar(fill = "#C8CFD7", color = "#feff6a") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
labs(title = "Frequency of Secondary Types", x = "Type", y = "Frequency") +
geom_text(aes(Type2, count + 5, label = count, fill = NULL), data = totals)
Flying is the most common secondary type by a landslide. As previously mentioned, very few Pokemon have Flying as a primary type, so this isn’t too shocking. I think it’s interesting to see Water and Normal shift to the back of the pack here as well. Poison is commonly paired with Grass or Bug (two of the most common primary types). It’s interesting to see the creators’ choices between primary and secondary typing. For Pokemon with multiple types, I’m not sure what difference it makes which is “primary” and which is “secondary”.
Note 2: Second graph == Gen 2 Color Scheme (Gold & Silver)
type_combinations <- pokemon %>%
mutate(Type2 = ifelse(is.na(Type2), "", Type2)) %>%
group_by(Type1, Type2) %>%
summarise(count=n())
#Pikachu Color Scheme
type_combinations %>%
ggplot(aes(x=Type1,y=as.character(Type2))) +
geom_tile(aes(fill = count), show.legend = FALSE) +
geom_text(aes(label=count)) +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
labs(x="Type 1", y="Type 2",
title="Type Combinations") +
scale_fill_gradient(low="#f6bd20", high="#c52018")
This visualization gives us insight into which types are paired together. Some of these are less surprising than others. For example, Poison being combined with Bug and Grass isn’t hard to make sense of, whereas Electric/Water might seem a little more contradictory. I think one interesting takeaway is that despite over 700 distinct Pokemon, only about half of the possible combinations have been explored! Imagine what a Ghost/Rock Pokemon could look like! Flying and Fairy obviously have the most ground to make up. There are 39 type combinations with exactly one Pokemon (order matters here, i.e. Rock/Bug is counted separately from Bug/Rock).
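Counts like “combinations with exactly one Pokemon” can be pulled straight from the combination table. The sketch below uses a made-up `toy` tibble and, unlike the heatmap, drops single-type Pokemon first; the 18 × 17 figure is simply the number of ordered dual-type pairs that 18 types allow.

```r
library(dplyr)

# Toy typings (illustrative only); NA means no secondary type
toy <- tibble(
  Type1 = c("Bug", "Bug", "Rock", "Grass", "Grass", "Fire"),
  Type2 = c("Rock", "Poison", "Bug", "Poison", "Poison", NA)
)

# One row per ordered (Type1, Type2) pair that actually occurs
combos <- toy %>%
  filter(!is.na(Type2)) %>%
  count(Type1, Type2, name = "count")

# Pairs represented by exactly one row
singletons <- combos %>% filter(count == 1)

# Ordered dual-type pairs possible with 18 types
possible_pairs <- 18 * 17
share_used <- nrow(combos) / possible_pairs
```

Because the pairs are ordered, Bug/Rock and Rock/Bug are counted as two different combinations.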
# Bug, Dark, Dragon, Electric, Fairy, Fighting, Fire, Flying, Ghost, Grass, Ground, Ice, Normal, Poison, Psychic, Rock, Steel, Water
type_colors = c("#A8B820", "#705848", "#7038F8", "#F8D030", "#EE99AC", "#C03028","#F08030","#A890F0",
"#705898", "#78C850", "#E0C068", "#98D8D8","#A8A878", "#A040A0", "#F85888", "#B8A038",
"#B8B8D0", "#6890F0")
type_colors_outline = c("#C6D16E", "#49392F", "#4924A1", "#A1871F", "#9B6470", "#7D1F1A", "#9C531F",
"#6D5E9C", "#493963", "#4E8234", "#927D44", "#638D8D", "#6D6D4E", "#682A68",
"#A13959", "#786824", "#787887", "#445E9C")
pokemon %>%
ggplot(aes(x = Type1, y = Total, fill = Type1, color = Type1)) +
geom_boxplot(show.legend = FALSE) +
labs(title = "Stats by Primary Type") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none")
Dragon seems to be the strongest type. This makes sense, as Dragon Pokemon are typically rare and often “pseudo-legendary”. Most types have a median value between 400 and 450, while Dragon is right around 600 (for reference, it’s common for Legendary Pokemon to have a stat total of 600). The Flying type is skewed upward by its small sample size, which consists of one Legendary (Tornadus, in two forms) and a single two-stage evolutionary line (Noibat and Noivern). Psychic has the greatest variation, with Pokemon spanning the whole spectrum! The extremes come from it containing some of the strongest Pokemon (Mewtwo/Deoxys/Hoopa) and some of the weakest (Spoink/Abra).
pokemon %>%
filter(Legendary == TRUE) %>%
ggplot(aes(x=Type1, fill = Type1, color = Type1)) +
geom_bar(show.legend = FALSE) +
scale_fill_manual(values = type_colors[-c(1,6, 14)],
guide = "none") +
scale_color_manual(values = type_colors_outline[-c(1,6, 14)],
guide = "none") +
labs(title = "Primary Type of Legendary Pokemon")
This emphasizes that Psychic and Dragon contain some of the most powerful Pokemon in existence. Notably, some types are missing: Bug, Fighting, and Poison. To me, it’s a little surprising that these types haven’t received a single Legendary Pokemon in six generations, but there are only a few Legendaries introduced each generation. I’m sure that from a conceptual and marketing standpoint, it’s not easy to make a Legendary Pokemon centered around one of these types.
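Dropping colors by position with `[-c(1, 6, 14)]` works, but it depends on remembering which index is which type. A sketch of a name-based alternative is below; `type_names`, `named_type_colors`, and `missing_types` are names I made up for illustration.

```r
# Type names in the same alphabetical order as the color vector
type_names <- c("Bug", "Dark", "Dragon", "Electric", "Fairy", "Fighting",
                "Fire", "Flying", "Ghost", "Grass", "Ground", "Ice",
                "Normal", "Poison", "Psychic", "Rock", "Steel", "Water")

# Attach a name to every color so lookups no longer depend on position
named_type_colors <- setNames(
  c("#A8B820", "#705848", "#7038F8", "#F8D030", "#EE99AC", "#C03028",
    "#F08030", "#A890F0", "#705898", "#78C850", "#E0C068", "#98D8D8",
    "#A8A878", "#A040A0", "#F85888", "#B8A038", "#B8B8D0", "#6890F0"),
  type_names
)

# Types with no Legendary Pokemon in this data
missing_types <- c("Bug", "Fighting", "Poison")

# Keep only the colors for types that do appear
legendary_colors <- named_type_colors[setdiff(type_names, missing_types)]
```

When `scale_fill_manual()` is given a named vector for `values`, it matches colors to data values by name, so the full named vector could be passed as-is without subsetting at all.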
# Latios/Latias Color Scheme
pokemon %>%
ggplot(aes(fill = Legendary, x=Type1)) +
geom_bar(position="stack") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = c("#cd696e", "#7db5da")) +
labs(title = "Legendary Pokemon by Primary Type", x = "Primary Type", y = "Frequency")
Notice the lack of blue in three columns (Bug, Fighting, and Poison). This also shows the scarcity of Legendary Pokemon.
# Xerneas/Yveltal Color Scheme
pokemon %>%
ggplot(aes(fill = isMega, x=Type1)) +
geom_bar(position="stack") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = c("#e9351c", "#6275b9")) +
labs(title = "Mega Pokemon by Primary Type", x = "Primary Type", y = "Frequency")
Mega Pokemon are even rarer! They were only recently introduced in Generation 6 alongside the new Fairy type. The first batch of Mega evolutions went to Pokemon from Generations 1 through 4, and the list has since expanded. Mega Pokemon added an amazing new aspect to competitive Pokemon, as many of them basically had the stats of a Legendary Pokemon (which are sometimes not allowed in the OU tier). Some Mega Pokemon were deemed “overpowered” and relegated to the “Ubers” tier. Mega Blaziken and Mega Gengar were two that I remember being banned relatively quickly.
We’ve covered a lot of ground with regard to typing. Let us move on to inspecting the different splits of Pokemon stats.
is_outlier <- function(x) {
return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}
pokemon %>%
ggplot(aes(x = Type1, y = HP, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "HP by Primary Type")
pokemon %>%
filter(is_outlier(HP) == TRUE) %>%
mutate(HPPercent = round(HP / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, HP, HPPercent)
## # A tibble: 19 x 7
## PokedexNum Name Type1 Type2 Total HP HPPercent
## <int> <chr> <chr> <chr> <int> <int> <dbl>
## 1 40 Wigglytuff Normal Fairy 435 140 0.32
## 2 113 Chansey Normal <NA> 450 250 0.56
## 3 131 Lapras Water Ice 535 130 0.24
## 4 134 Vaporeon Water <NA> 525 130 0.25
## 5 143 Snorlax Normal <NA> 540 160 0.3
## 6 202 Wobbuffet Psychic <NA> 405 190 0.47
## 7 242 Blissey Normal <NA> 540 255 0.47
## 8 289 Slaking Normal <NA> 670 150 0.22
## 9 292 Shedinja Bug Ghost 236 1 0
## 10 297 Hariyama Fighting <NA> 474 144 0.3
## 11 320 Wailmer Water <NA> 400 130 0.32
## 12 321 Wailord Water <NA> 500 170 0.34
## 13 426 Drifblim Ghost Flying 498 150 0.3
## 14 446 Munchlax Normal <NA> 390 135 0.35
## 15 487 GiratinaAltered Forme Ghost Dragon 680 150 0.22
## 16 487 GiratinaOrigin Forme Ghost Dragon 680 150 0.22
## 17 594 Alomomola Water <NA> 470 165 0.35
## 18 716 Xerneas Fairy <NA> 680 126 0.19
## 19 717 Yveltal Dark Flying 680 126 0.19
HP seems to be pretty consistent across all types. The two big outliers for the Normal type are Blissey and Chansey. We’ll come back to them later. The Psychic outlier is Wobbuffet, which is interesting because its HP stat accounts for nearly 50% of its total stats (similar to Chansey and Blissey). Chansey leads the pack in this regard, with 56% of her total stats attributed to HP (Chansey and Blissey are always female). The Bug Pokemon that looks like it has no HP is Shedinja, with an HP stat of exactly 1! Shuckle, another Bug type with very low HP (20), makes up for it with staggering defensive stats, as we’ll see later.
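One subtlety: the boxplot draws its outlier points per type, while the table above applies `is_outlier()` to the whole column at once, so the two can disagree. A minimal sketch of the difference, using made-up HP values and placeholder names:

```r
library(dplyr)

# Same 1.5 * IQR rule as above
is_outlier <- function(x) {
  x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x)
}

# Made-up values: a high-HP type and a low-HP type (illustrative only)
toy <- tibble(
  Name  = c(paste0("N", 1:5), paste0("B", 1:5)),
  Type1 = rep(c("Normal", "Bug"), each = 5),
  HP    = c(100, 105, 110, 115, 250, 1, 40, 45, 50, 55)
)

# Outliers relative to every Pokemon at once
global_outliers <- toy %>% filter(is_outlier(HP))

# Outliers relative to each Pokemon's own primary type
per_type_outliers <- toy %>%
  group_by(Type1) %>%
  filter(is_outlier(HP)) %>%
  ungroup()
```

In this toy data only the 250 HP row is a global outlier, while the per-type version also flags the 1 HP row, which is closer to what the boxplot shows.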
pokemon %>%
ggplot(aes(x = Type1, y = Attack, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Atk by Primary Type")
pokemon %>%
filter(is_outlier(Attack) == TRUE) %>%
mutate(AtkPercent = round(Attack / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, Attack, AtkPercent)
## # A tibble: 7 x 7
## PokedexNum Name Type1 Type2 Total Attack AtkPercent
## <int> <chr> <chr> <chr> <int> <int> <dbl>
## 1 150 MewtwoMega Mewtwo X Psychic Fighting 780 190 0.24
## 2 214 HeracrossMega Heracross Bug Fighting 600 185 0.31
## 3 383 GroudonPrimal Groudon Ground Fire 770 180 0.23
## 4 384 RayquazaMega Rayquaza Dragon Flying 780 180 0.23
## 5 386 DeoxysAttack Forme Psychic <NA> 600 180 0.3
## 6 445 GarchompMega Garchomp Dragon Ground 700 170 0.24
## 7 646 KyuremBlack Kyurem Dragon Ice 700 170 0.24
We see a number of outliers here. The two Psychic types that are off the charts are Mega Mewtwo X and Deoxys (Attack Forme), two Pokemon that were essentially made to be overpowered in the Attack stat. Mega Heracross is the Bug outlier and shows the absurdity of Mega evolutions. Heracross is not a stellar Pokemon by most measures, yet its Mega evolution’s Attack stat rivals the highest in the game. The Normal lower outlier is Chansey, with a whopping Attack stat of 5. We see more variation between types in the Attack stat than we did in the HP stat. Fairy and Psychic have below-average Attack stats because they are known for their SpAtk. This fits in line with their typing and appearances. Fighting has an above-average Attack stat, again in line with the bulky, muscular design of most Fighting types.
pokemon %>%
ggplot(aes(x = Type1, y = Defense, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "DEF by Primary Type")
pokemon %>%
filter(is_outlier(Defense) == TRUE) %>%
mutate(DefPercent = round(Defense / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, Defense, DefPercent)
## # A tibble: 13 x 7
## PokedexNum Name Type1 Type2 Total Defense DefPercent
## <int> <chr> <chr> <chr> <int> <int> <dbl>
## 1 80 SlowbroMega Slowbro Water Psychic 590 180 0.31
## 2 91 Cloyster Water Ice 525 180 0.34
## 3 95 Onix Rock Ground 385 160 0.42
## 4 208 Steelix Steel Ground 510 200 0.39
## 5 208 SteelixMega Steelix Steel Ground 610 230 0.38
## 6 213 Shuckle Bug Rock 505 230 0.46
## 7 306 Aggron Steel Rock 530 180 0.34
## 8 306 AggronMega Aggron Steel <NA> 630 230 0.37
## 9 377 Regirock Rock <NA> 580 200 0.34
## 10 383 GroudonPrimal Groudon Ground Fire 770 160 0.21
## 11 386 DeoxysDefense Forme Psychic <NA> 600 160 0.27
## 12 411 Bastiodon Rock Steel 495 168 0.34
## 13 713 Avalugg Ice <NA> 514 184 0.36
We continue to see this pattern between a type’s appearance and its stats. Steel and Rock (and to a lesser extent, Ground) Pokemon are usually made out of hard material, and this is reflected in the Defense stat. Shuckle is an exception among Bug types, which are typically small creatures: Shuckle has a rock-hard shell, hence its secondary Rock typing. In fact, many of these outliers are depicted with either some hard material (e.g. steel/rock) or a shell.
pokemon %>%
ggplot(aes(x = Type1, y = SpAtk, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "SpAtk by Primary Type")
pokemon %>%
filter(is_outlier(SpAtk) == TRUE) %>%
mutate(SpAtkPercent = round(SpAtk / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, SpAtk, SpAtkPercent)
## # A tibble: 10 x 7
## PokedexNum Name Type1 Type2 Total SpAtk SpAtkPercent
## <int> <chr> <chr> <chr> <int> <int> <dbl>
## 1 65 AlakazamMega Alakazam Psychic <NA> 590 175 0.3
## 2 94 GengarMega Gengar Ghost Poison 600 170 0.28
## 3 150 MewtwoMega Mewtwo Y Psychic <NA> 780 194 0.25
## 4 181 AmpharosMega Ampharos Electric Dragon 610 165 0.27
## 5 282 GardevoirMega Gardevoir Psychic Fairy 618 165 0.27
## 6 382 KyogrePrimal Kyogre Water <NA> 770 180 0.23
## 7 384 RayquazaMega Rayquaza Dragon Flying 780 180 0.23
## 8 386 DeoxysAttack Forme Psychic <NA> 600 180 0.3
## 9 646 KyuremWhite Kyurem Dragon Ice 700 170 0.24
## 10 720 HoopaHoopa Unbound Psychic Dark 680 170 0.25
We see more variance with the Special Attack stat than with any other statistic. On top of the type-to-type variance, we see large variance within types, especially within the Psychic, Dragon, Electric, and Water typings.
pokemon %>%
ggplot(aes(x = Type1, y = SpDef, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "SpDef by Primary Type")
pokemon %>%
filter(is_outlier(SpDef) == TRUE) %>%
mutate(SpDefPercent = round(SpDef / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, SpDef, SpDefPercent)
## # A tibble: 7 x 7
## PokedexNum Name Type1 Type2 Total SpDef SpDefPercent
## <int> <chr> <chr> <chr> <int> <int> <dbl>
## 1 213 Shuckle Bug Rock 505 230 0.46
## 2 249 Lugia Psychic Flying 680 154 0.23
## 3 250 Ho-oh Fire Flying 680 154 0.23
## 4 378 Regice Ice <NA> 580 200 0.34
## 5 382 KyogrePrimal Kyogre Water <NA> 770 160 0.21
## 6 386 DeoxysDefense Forme Psychic <NA> 600 160 0.27
## 7 671 Florges Fairy <NA> 552 154 0.28
Special Defense shows less variance between types. Again, we see that Psychic and Dragon types have higher than average Special Defense stats. Let’s remember that the reason we continue to see these two typings at the top of many of our charts is that many of the best Pokemon have them. Fairy and Electric also appear to be higher than average, but not as noticeably.
pokemon %>%
ggplot(aes(x = Type1, y = Speed, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Speed by Primary Type")
pokemon %>%
filter(is_outlier(Speed) == TRUE) %>%
mutate(SpeedPercent = round(Speed / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, Speed, SpeedPercent)
## # A tibble: 2 x 7
## PokedexNum Name Type1 Type2 Total Speed SpeedPercent
## <int> <chr> <chr> <chr> <int> <int> <dbl>
## 1 291 Ninjask Bug Flying 456 160 0.35
## 2 386 DeoxysSpeed Forme Psychic <NA> 600 180 0.3
Flying’s high boxplot can again be attributed to a small sample size, while Psychic’s wide variance can be attributed to the different kinds of Pokemon found in the typing. Electric and Dragon have high Speed, while Fairy is below average.
pokemon %>%
ggplot(aes(x = Type1, y = AtkTotal, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Total Atk by Primary Type")
pokemon %>%
filter(is_outlier(AtkTotal) == TRUE) %>%
mutate(AtkPercent = round(AtkTotal / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, AtkTotal, AtkPercent)
## # A tibble: 16 x 7
## PokedexNum Name Type1 Type2 Total AtkTotal AtkPercent
## <int> <chr> <chr> <chr> <int> <dbl> <dbl>
## 1 150 MewtwoMega Mewtwo X Psychic Fighting 780 344 0.44
## 2 150 MewtwoMega Mewtwo Y Psychic <NA> 780 344 0.44
## 3 257 BlazikenMega Blaziken Fire Fighting 630 290 0.46
## 4 381 LatiosMega Latios Dragon Psychic 700 290 0.41
## 5 382 KyogrePrimal Kyogre Water <NA> 770 330 0.43
## 6 383 GroudonPrimal Groudon Ground Fire 770 330 0.43
## 7 384 Rayquaza Dragon Flying 680 300 0.44
## 8 384 RayquazaMega Rayquaza Dragon Flying 780 360 0.46
## 9 386 DeoxysNormal Forme Psychic <NA> 600 300 0.5
## 10 386 DeoxysAttack Forme Psychic <NA> 600 360 0.6
## 11 445 GarchompMega Garchomp Dragon Ground 700 290 0.41
## 12 646 KyuremBlack Kyurem Dragon Ice 700 290 0.41
## 13 646 KyuremWhite Kyurem Dragon Ice 700 290 0.41
## 14 681 AegislashBlade Forme Steel Ghost 520 300 0.580
## 15 719 DiancieMega Diancie Rock Fairy 700 320 0.46
## 16 720 HoopaHoopa Unbound Psychic Dark 680 330 0.49
Here we examine combined offensive prowess (Attack + SpAtk) to emphasize what a powerhouse the Dragon type is. Pokemon with high Atk and SpAtk stats are typically known as “Mixed Attackers”, while those with high Atk or high SpAtk would be “Physical Sweepers” or “Special Sweepers”, respectively. We notice the Psychic type falls in line with the rest of the types when we measure Total Atk, because Psychic Pokemon typically have low Atk stats.
pokemon %>%
ggplot(aes(x = Type1, y = DefTotal, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Total Def by Primary Type")
pokemon %>%
filter(is_outlier(DefTotal) == TRUE) %>%
mutate(DefPercent = round(DefTotal / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, DefTotal, DefPercent)
## # A tibble: 12 x 7
## PokedexNum Name Type1 Type2 Total DefTotal DefPercent
## <int> <chr> <chr> <chr> <int> <dbl> <dbl>
## 1 208 SteelixMega Steelix Steel Ground 610 325 0.53
## 2 213 Shuckle Bug Rock 505 460 0.91
## 3 306 AggronMega Aggron Steel <NA> 630 310 0.49
## 4 377 Regirock Rock <NA> 580 300 0.52
## 5 378 Regice Ice <NA> 580 300 0.52
## 6 379 Registeel Steel <NA> 580 300 0.52
## 7 386 DeoxysDefense Forme Psychic <NA> 600 320 0.53
## 8 411 Bastiodon Rock Steel 495 306 0.62
## 9 476 Probopass Rock Steel 525 295 0.56
## 10 681 AegislashShield Forme Steel Ghost 520 300 0.580
## 11 703 Carbink Rock Fairy 500 300 0.6
## 12 719 Diancie Rock Fairy 600 300 0.5
Again, we see Steel, Rock, and Dragon rise to the top, but not as noticeably as before, because Steel and Rock’s high Def stats are offset by their average SpDef stats.
# Legendary Birds Color Scheme
pokemon %>%
ggplot(aes(x = classification, y = Total, color = classification, fill = classification)) +
geom_boxplot(show.legend = FALSE) +
scale_fill_manual(values = c("#d50808", "#ffd541", "#94c5ff"),
guide = "none") +
scale_color_manual(values = c("#ffc54a", "#9c7b10", "#005273"),
guide = "none") +
labs(title = "Total Stats by Classification")
We see how comparable the Legendary and Mega Pokemon are. The introduction of Mega Pokemon essentially gave fan-favorite Pokemon another evolution with near-Legendary stats. Other overpowered formes were introduced as well, such as the Primal forms of Kyogre and Groudon. Also note the overlap between normal Pokemon and Legendary/Mega. This overlap is mostly due to pseudo-legendary Pokemon. These Pokemon are typically Dragon types found late in the game whose base stats sum to 600, a very high total.
pokemon %>%
ggplot(aes(x=Total)) +
geom_density(alpha=0.5, aes(fill=Type1)) +
facet_wrap(~Type1) +
labs(x="Total", y="Density") +
scale_fill_manual(values = type_colors,
guide = "none")
Most types have a peak or two, mostly explained by the typical stat totals at each stage as Pokemon evolve. Psychic appears to be an exception, as there is no clear peak but rather a smooth, nearly uniform density. Steel, Fairy, and Dark do not have as many evolutions as some of the other types, so they are basically unimodal.
# Generation Mascot Color Scheme
pokemon %>%
ggplot(aes(x = Generation, y = Total, color = Generation, fill = Generation)) +
geom_boxplot() +
scale_fill_manual(values = c("#2062ac", "#deac00", "#ff2029", "#205a94", "#181820", "#6275b9"),
guide = "none") +
scale_color_manual(values = c("#F2684A", "#9cace6", "#313973", "#bd6ad5", "#bdbdd5", "#e9351c"),
guide = "none") +
labs(title = "Total Stats by Generation")
Interestingly enough, there does not appear to be a significant difference in stat totals between the 6 generations. Generation 4 has a slightly higher average, and I believe this might be due to the introduction of many new evolutions for earlier Pokemon, such as Magmortar, Electivire, etc.
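Since a boxplot shows medians rather than means, it can help to put numbers next to the picture. A minimal sketch with a made-up `toy` tibble (the totals are invented for illustration, not taken from the data):

```r
library(dplyr)

# Made-up stat totals for two generations (illustrative only)
toy <- tibble(
  Generation = factor(c(1, 1, 1, 4, 4, 4)),
  Total      = c(318, 405, 525, 405, 535, 540)
)

# Mean and median stat total per generation
gen_summary <- toy %>%
  group_by(Generation) %>%
  summarise(mean_total = mean(Total), median_total = median(Total))
```

Running the same summary on `pokemon` would show whether Generation 4 is higher on the mean, the median, or both.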
pokemon %>%
mutate(MaxAtk = ifelse(Attack > SpAtk, Attack, SpAtk)) %>%
filter(MaxAtk > 100) %>%
ggplot(aes(x = Speed, y = MaxAtk)) +
geom_point(aes(color = Type1)) +
geom_smooth(method = 'lm') +
scale_color_manual(values = type_colors) +
labs(title = "Offensive Potential (Speed vs. MaxAtk)")
pokemon %>%
mutate(MaxAtk = ifelse(Attack > SpAtk, Attack, SpAtk)) %>%
filter(MaxAtk >= 160 & Speed > 120) %>%
select(PokedexNum, Name, Type1, Type2, Total, Attack, SpAtk, MaxAtk,Speed)
## # A tibble: 5 x 9
## PokedexNum Name Type1 Type2 Total Attack SpAtk MaxAtk Speed
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int>
## 1 65 AlakazamMega Alaka… Psychic <NA> 590 50 175 175 150
## 2 94 GengarMega Gengar Ghost Poison 600 65 170 170 130
## 3 150 MewtwoMega Mewtwo X Psychic Fighti… 780 190 154 190 130
## 4 150 MewtwoMega Mewtwo Y Psychic <NA> 780 150 194 194 140
## 5 386 DeoxysAttack Forme Psychic <NA> 600 180 180 180 150
pokemon %>%
mutate(MaxAtk = ifelse(Attack > SpAtk, Attack, SpAtk)) %>%
filter(MaxAtk >= 100 & Speed < 40) %>%
select(PokedexNum, Name, Type1, Type2, Total, Attack, SpAtk, MaxAtk,Speed)
## # A tibble: 20 x 9
## PokedexNum Name Type1 Type2 Total Attack SpAtk MaxAtk Speed
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int>
## 1 80 Slowbro Water Psych… 490 75 100 100 30
## 2 80 SlowbroMega Slowbro Water Psych… 590 75 130 130 30
## 3 143 Snorlax Normal <NA> 540 110 65 110 30
## 4 185 Sudowoodo Rock <NA> 410 100 30 100 30
## 5 192 Sunflora Grass <NA> 425 75 105 105 30
## 6 199 Slowking Water Psych… 490 75 100 100 30
## 7 208 SteelixMega Steelix Steel Ground 610 125 55 125 30
## 8 323 CameruptMega Camer… Fire Ground 560 120 145 145 20
## 9 328 Trapinch Ground <NA> 290 100 45 100 10
## 10 460 AbomasnowMega Abom… Grass Ice 594 132 132 132 30
## 11 518 Musharna Psychic <NA> 487 55 107 107 29
## 12 525 Boldore Rock <NA> 390 105 50 105 20
## 13 526 Gigalith Rock <NA> 515 135 60 135 25
## 14 565 Carracosta Water Rock 495 108 83 108 32
## 15 577 Solosis Psychic <NA> 290 30 105 105 20
## 16 578 Duosion Psychic <NA> 370 40 125 125 30
## 17 579 Reuniclus Psychic <NA> 490 65 125 125 30
## 18 589 Escavalier Bug Steel 495 135 60 135 20
## 19 680 Doublade Steel Ghost 448 110 45 110 35
## 20 713 Avalugg Ice <NA> 514 117 44 117 28
Here, I wanted to examine Offensive Potential by looking at Pokemon with high Speed and Atk/SpAtk stats. The issue in only looking at these two stats is that they usually come with poor Defense or HP. Pokemon with high Speed and Atk/SpAtk but low HP/Def/SpDef are usually known as “Glass Cannons”. Mega Alakazam is a great example of this.
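As a side note, the `ifelse(Attack > SpAtk, Attack, SpAtk)` pattern used for `MaxAtk` is an element-wise maximum, which base R also offers as `pmax()`. A small sketch using the Attack and SpAtk values of Mega Alakazam, Mega Mewtwo X, and Deoxys (Attack Forme) from the table above:

```r
# Attack and SpAtk for three rows from the table above
attack <- c(50, 190, 180)
sp_atk <- c(175, 154, 180)

# The approach used in the post
max_ifelse <- ifelse(attack > sp_atk, attack, sp_atk)

# Element-wise maximum, same result with less repetition
max_pmax <- pmax(attack, sp_atk)
```

Both give 175, 190, and 180, so swapping one for the other inside `mutate()` should not change any of the results.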
pokemon %>%
mutate(MaxDef = ifelse(Defense > SpDef, Defense, SpDef)) %>%
ggplot(aes(x = HP, y = MaxDef)) +
geom_point(aes(color = Type1)) +
geom_smooth(method = 'lm') +
scale_color_manual(values = type_colors) +
labs(title = "Wall Potential (HP vs. MaxDef)")
pokemon %>%
mutate(MaxDef = ifelse(Defense > SpDef, Defense, SpDef)) %>%
filter(HP >= 150) %>%
arrange(-HP) %>%
select(PokedexNum, Name, Type1, Type2, Total, HP, Defense, SpDef, MaxDef)
## # A tibble: 10 x 9
## PokedexNum Name Type1 Type2 Total HP Defense SpDef MaxDef
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int>
## 1 242 Blissey Normal <NA> 540 255 10 135 135
## 2 113 Chansey Normal <NA> 450 250 5 105 105
## 3 202 Wobbuffet Psychic <NA> 405 190 58 58 58
## 4 321 Wailord Water <NA> 500 170 45 45 45
## 5 594 Alomomola Water <NA> 470 165 80 45 80
## 6 143 Snorlax Normal <NA> 540 160 65 110 110
## 7 289 Slaking Normal <NA> 670 150 100 65 100
## 8 426 Drifblim Ghost Flyi… 498 150 44 54 54
## 9 487 GiratinaAltered Fo… Ghost Drag… 680 150 120 120 120
## 10 487 GiratinaOrigin For… Ghost Drag… 680 150 100 100 100
pokemon %>%
mutate(MaxDef = ifelse(Defense > SpDef, Defense, SpDef)) %>%
filter(MaxDef >= 150) %>%
arrange(-MaxDef) %>%
select(PokedexNum, Name, Type1, Type2, Total, HP, Defense, SpDef, MaxDef)
## # A tibble: 28 x 9
## PokedexNum Name Type1 Type2 Total HP Defense SpDef MaxDef
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int>
## 1 208 SteelixMega Steelix Steel Ground 610 75 230 95 230
## 2 213 Shuckle Bug Rock 505 20 230 230 230
## 3 306 AggronMega Aggron Steel <NA> 630 70 230 80 230
## 4 208 Steelix Steel Ground 510 75 200 65 200
## 5 377 Regirock Rock <NA> 580 80 200 100 200
## 6 378 Regice Ice <NA> 580 80 100 200 200
## 7 713 Avalugg Ice <NA> 514 95 184 46 184
## 8 80 SlowbroMega Slowbro Water Psychic 590 95 180 80 180
## 9 91 Cloyster Water Ice 525 50 180 45 180
## 10 306 Aggron Steel Rock 530 70 180 60 180
## # … with 18 more rows
Here I wanted to look for “Walls”, or Pokemon with high HP and Def/SpDef. These Pokemon are usually part of a “stall” strategy, where they inflict status conditions and use healing moves while chipping away at your HP. Like the Offensive graph, this one fails to map out the other stats; however, these “Walls” usually have poor offensive stats.
# Scary Max
pokemon %>%
summarise(n = max(PokedexNum + 1),
name = "OP",
Total = max(HP)+max(Attack)+max(SpAtk)+max(Defense)+max(SpDef) + max(Speed),
HP = max(HP),
Attack = max(Attack),
SpAtk = max(SpAtk),
Defense = max(Defense),
SpDef = max(SpDef),
Speed = max(Speed))
## # A tibble: 1 x 9
## n name Total HP Attack SpAtk Defense SpDef Speed
## <dbl> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 722 OP 1279 255 190 194 230 230 180
pokemon %>%
filter(Total == max(Total))
## # A tibble: 3 x 18
## PokedexNum Name Type1 Type2 Total HP Attack Defense SpAtk SpDef Speed
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 150 Mewt… Psyc… Figh… 780 106 190 100 154 100 130
## 2 150 Mewt… Psyc… <NA> 780 106 150 70 194 120 140
## 3 384 Rayq… Drag… Flyi… 780 105 180 100 180 100 115
## # … with 7 more variables: Generation <fct>, Legendary <lgl>, AtkTotal <dbl>,
## # DefTotal <dbl>, isMega <lgl>, isMultiType <lgl>, classification <fct>
pokemon %>%
summarise(n = min(PokedexNum - 1),
name = "OP",
Total = min(HP)+min(Attack)+min(SpAtk)+min(Defense)+min(SpDef) + min(Speed),
HP = min(HP),
Attack = min(Attack),
SpAtk = min(SpAtk),
Defense = min(Defense),
SpDef = min(SpDef),
Speed = min(Speed))
## # A tibble: 1 x 9
## n name Total HP Attack SpAtk Defense SpDef Speed
## <dbl> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 0 OP 46 1 5 10 5 20 5
pokemon %>%
filter(Total == min(Total))
## # A tibble: 1 x 18
## PokedexNum Name Type1 Type2 Total HP Attack Defense SpAtk SpDef Speed
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 191 Sunk… Grass <NA> 180 30 30 30 30 30 30
## # … with 7 more variables: Generation <fct>, Legendary <lgl>, AtkTotal <dbl>,
## # DefTotal <dbl>, isMega <lgl>, isMultiType <lgl>, classification <fct>
pokemon %>%
summarise(n = as.integer(mean(PokedexNum - 1)),
name = "OP",
Total = as.integer(mean(HP))+as.integer(mean(Attack))+as.integer(mean(SpAtk))+as.integer(mean(Defense))+as.integer(mean(SpDef)) + as.integer(mean(Speed)),
HP = as.integer(mean(HP)),
Attack = as.integer(mean(Attack)),
SpAtk = as.integer(mean(SpAtk)),
Defense = as.integer(mean(Defense)),
SpDef = as.integer(mean(SpDef)),
Speed = as.integer(mean(Speed)))
## # A tibble: 1 x 9
## n name Total HP Attack SpAtk Defense SpDef Speed
## <int> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 361 OP 432 69 79 72 73 71 68
pokemon %>%
filter(Total == 432)
## # A tibble: 0 x 18
## # … with 18 variables: PokedexNum <int>, Name <chr>, Type1 <chr>, Type2 <chr>,
## # Total <int>, HP <int>, Attack <int>, Defense <int>, SpAtk <int>,
## # SpDef <int>, Speed <int>, Generation <fct>, Legendary <lgl>,
## # AtkTotal <dbl>, DefTotal <dbl>, isMega <lgl>, isMultiType <lgl>,
## # classification <fct>
pokemon %>%
summarise(n = as.integer(median(PokedexNum - 1)),
name = "OP",
Total = as.integer(median(HP))+as.integer(median(Attack))+as.integer(median(SpAtk))+as.integer(median(Defense))+as.integer(median(SpDef)) + as.integer(median(Speed)),
HP = as.integer(median(HP)),
Attack = as.integer(median(Attack)),
SpAtk = as.integer(median(SpAtk)),
Defense = as.integer(median(Defense)),
SpDef = as.integer(median(SpDef)),
Speed = as.integer(median(Speed)))
## # A tibble: 1 x 9
## n name Total HP Attack SpAtk Defense SpDef Speed
## <int> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 363 OP 410 65 75 65 70 70 65
pokemon %>%
filter(Total == 410)
## # A tibble: 9 x 18
## PokedexNum Name Type1 Type2 Total HP Attack Defense SpAtk SpDef Speed
## <int> <chr> <chr> <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 77 Pony… Fire <NA> 410 50 85 55 65 65 90
## 2 185 Sudo… Rock <NA> 410 70 100 115 30 65 30
## 3 219 Magc… Fire Rock 410 50 50 120 80 80 30
## 4 247 Pupi… Rock Grou… 410 70 84 70 65 70 51
## 5 308 Medi… Figh… Psyc… 410 60 60 75 60 75 80
## 6 364 Seal… Ice Water 410 90 60 70 75 70 45
## 7 400 Biba… Norm… Water 410 79 85 60 55 60 71
## 8 444 Gabi… Drag… Grou… 410 68 90 65 50 55 82
## 9 611 Frax… Drag… <NA> 410 66 117 70 40 50 67
## # … with 7 more variables: Generation <fct>, Legendary <lgl>, AtkTotal <dbl>,
## # DefTotal <dbl>, isMega <lgl>, isMultiType <lgl>, classification <fct>
This is just a fun look at a hypothetical Pokemon built from the max/min/mean/median of every stat, and at how it measures up against actual Pokemon.
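The four summaries can also be collected into a single table instead of four separate `summarise()` calls. Below is a rough sketch in base R. It uses a hypothetical `stats_demo` data frame holding three rows copied from the outputs above rather than the full data, and, unlike the code above, it does not truncate the means with `as.integer()`.

```r
# Toy data: three rows copied from the outputs above
stats_demo <- data.frame(
  Name    = c("Sunkern", "Ponyta", "Fraxure"),
  HP      = c(30, 50, 66),
  Attack  = c(30, 85, 117),
  Defense = c(30, 55, 70),
  SpAtk   = c(30, 65, 40),
  SpDef   = c(30, 65, 50),
  Speed   = c(30, 90, 67),
  stringsAsFactors = FALSE
)
stat_cols <- c("HP", "Attack", "Defense", "SpAtk", "SpDef", "Speed")

# One row per summary function, one column per stat
summary_tbl <- t(sapply(
  list(max = max, min = min, mean = mean, median = median),
  function(f) sapply(stats_demo[stat_cols], f)
))

# Add the sum of the per-stat values, like the Total of the "OP" rows above
summary_tbl <- cbind(summary_tbl, Total = rowSums(summary_tbl))
summary_tbl
```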
# Chansey Color Scheme
pokemon %>%
ggplot(aes(x=HP)) +
geom_histogram(binwidth=4, fill="#ffacac", colour="#ff835a") +
labs(x="HP", y="Frequency")
# Landorus Color Scheme
pokemon %>%
ggplot(aes(x=Attack)) +
geom_histogram(binwidth=4, fill="#f67b41", colour="#83624a") +
labs(x="Attack", y="Frequency")
# Greninja
pokemon %>%
ggplot(aes(x=SpAtk)) +
geom_histogram(binwidth=4, fill="#354698", colour="#e7788d") +
labs(x="SpAtk", y="Frequency")
# Steelix
pokemon %>%
ggplot(aes(x=Defense)) +
geom_histogram(binwidth=4, fill="#7b94a4", colour="#dee6de") +
labs(x="Defense", y="Frequency")
# Shuckle
pokemon %>%
ggplot(aes(x=SpDef)) +
geom_histogram(binwidth=4, fill="#b43129", colour="#ffff5a") +
labs(x="SpDef", y="Frequency")
#Deoxys
pokemon %>%
ggplot(aes(x=Speed)) +
geom_histogram(binwidth=4, fill="#5294ac", colour="#ff734a") +
labs(x="Speed", y="Frequency")
#Mewtwo
pokemon %>%
ggplot(aes(x=Total)) +
geom_histogram(binwidth=10, fill="#6a319c", colour="#b4acc5") +
labs(x="Total", y="Frequency")
Histograms displaying the spread of each stat.
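The seven histogram chunks repeat the same code with a different column each time. If the per-stat colour schemes are not needed, one possible shortcut is to reshape the data to long format and let `facet_wrap()` draw a panel per stat. This is only a sketch: `hist_demo` is a made-up stand-in with six Generation 1 rows and three stats, and every panel shares a single colour scheme.

```r
library(ggplot2)
library(tidyr)

# Toy stand-in for the real `pokemon` data (six Generation 1 rows)
hist_demo <- data.frame(
  HP     = c(45, 60, 80, 39, 58, 78),
  Attack = c(49, 62, 82, 52, 64, 84),
  Speed  = c(45, 60, 80, 65, 80, 100)
)

# Reshape to one row per (Pokemon, stat) pair
hist_long <- pivot_longer(hist_demo, cols = everything(),
                          names_to = "Stat", values_to = "Value")

# One histogram panel per stat
p <- ggplot(hist_long, aes(x = Value)) +
  geom_histogram(binwidth = 4, fill = "#ffacac", colour = "#ff835a") +
  facet_wrap(~Stat) +
  labs(x = "Stat value", y = "Frequency")
p
```

The trade-off is losing the themed colours, which is why I kept the separate chunks above.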
pokemon %>%
ggplot(aes(x=HP, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="HP", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Attack, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Attack", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=SpAtk, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="SpAtk", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Defense, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Defense", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=SpDef, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="SpDef", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Speed, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Speed", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Total, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Total", y="Density", title = "Legendary Comparison")
Density graphs showing the disparity between normal Pokemon and Legendary Pokemon for all stats.
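To put a number on the gap the density plots show, the group means can be compared directly. Here is a minimal sketch using base R's `aggregate()`; `legend_demo` is a hypothetical four-row table with values copied from outputs elsewhere in this post, so the means it prints describe the toy rows only, not the full data.

```r
# Toy data: two regular and two Legendary Pokemon
legend_demo <- data.frame(
  Name      = c("Bulbasaur", "Ivysaur", "Dialga", "Regigigas"),
  Total     = c(318, 405, 680, 670),
  Legendary = c(FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Mean Total for each value of Legendary
legend_means <- aggregate(Total ~ Legendary, data = legend_demo, FUN = mean)
legend_means
```

Swapping `Total` for any other stat column gives the matching comparison for that stat.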
pokemon %>%
group_by(Generation) %>%
summarise(avg = as.integer(mean(Total))) %>%
ggplot(aes(x=Generation, y = avg, group = 1)) +
geom_line() +
geom_point(color = "red") +
labs(title = "Average Total for each Generation")
Mapping the average Total in each Generation (similar to the box plots above).
pokemon %>%
group_by(Generation) %>%
summarize(HP=mean(HP),
Attack=mean(Attack),
Defense=mean(Defense),
SpAtk=mean(SpAtk),
SpDef=mean(SpDef),
Speed=mean(Speed)) %>%
gather(Stats, value, 2:7) %>%
ggplot(aes(x=Generation, y=value, group=1)) +
geom_line() +
geom_point(color = "red") +
facet_wrap(~Stats) +
labs(y="Average Stats")
The average for each stat across Generations. We see a downtick in Generation 2 for Attack, SpAtk, and Speed. Defense, SpDef, and HP do not show much variance.
#https://drmowinckels.io/blog/adding-external-images-to-plots/
#download.file("https://cdn.bulbagarden.net/upload/5/56/242Blissey.png", "blissey.png")
# download.file("http://cdn.bulbagarden.net/upload/f/f8/242MS.png", "blissey-h-sprite.png")
# download.file("cdn.bulbagarden.net/upload/e/ea/113MS.png", "chansey-h-sprite.png")
# download.file("cdn.bulbagarden.net/upload/f/fa/202MS.png", "wobb-h-sprite.png")
# download.file("cdn.bulbagarden.net/upload/e/ec/321MS.png", "wailord-h-sprite.png")
# download.file("cdn.bulbagarden.net/upload/5/5a/594MS.png", "alo-h-sprite.png")
# download.file("cdn.bulbagarden.net/upload/e/e0/143XYMS.png", "snorlax-h-sprite.png")
# download.file("cdn.bulbagarden.net/upload/0/0d/289MS.png", "slaking-h-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/e8/487MS.png", "g-alt-h-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/2f/487OMS.png", "g-o-h-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/4/45/426MS.png", "drif-h-sprite.png")
hp_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",
pattern = "-h-"),
pokemon %>%
top_n(10, HP) %>%
arrange(Name) %>%
mutate(rank = row_number(HP)) %>%
select(Name, rank)) %>%
arrange(rank)
hp_sprites
## # A tibble: 10 x 3
## images Name rank
## <chr> <chr> <int>
## 1 drif-h-sprite.png Drifblim 1
## 2 g-alt-h-sprite.png GiratinaAltered Forme 2
## 3 g-o-h-sprite.png GiratinaOrigin Forme 3
## 4 slaking-h-sprite.png Slaking 4
## 5 snorlax-h-sprite.png Snorlax 5
## 6 alo-h-sprite.png Alomomola 6
## 7 wailord-h-sprite.png Wailord 7
## 8 wobb-h-sprite.png Wobbuffet 8
## 9 chansey-h-sprite.png Chansey 9
## 10 blissey-h-sprite.png Blissey 10
img = readPNG("images/blissey.png")
g = rasterGrob(img, interpolate=TRUE)
g_sprite = list()
hp_plot <- pokemon %>%
select(Name, HP) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, HP), y=HP)) +
geom_bar(aes(fill=HP), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=HP)) +
scale_fill_gradient(low="#ff835a", high="#ffacac") +
coord_flip() +
labs(x="Name", title="Top 10 HP Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=7, ymin=160, ymax=260)
for(i in 1:nrow(hp_sprites)){
img = readPNG(paste0("images/",hp_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
hp_plot = hp_plot +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-15, ymax=2.5)
}
hp_plot
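One thing to watch with the `hp_sprites` table above: it pairs files with names purely by position, which works because `list.files()` and `arrange(Name)` happen to sort the same way for these file names. A more defensive option is an explicit name-to-file lookup joined by `Name`. The sketch below is an assumption on my part, not what the chunk above does; `sprite_lookup` and `top_hp` are made-up tables covering only three of the ten Pokemon.

```r
# Hypothetical explicit mapping from Pokemon name to sprite file
sprite_lookup <- data.frame(
  Name   = c("Blissey", "Chansey", "Wobbuffet"),
  images = c("blissey-h-sprite.png", "chansey-h-sprite.png", "wobb-h-sprite.png"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the top HP rows, deliberately out of order
top_hp <- data.frame(
  Name = c("Chansey", "Wobbuffet", "Blissey"),
  HP   = c(250, 190, 255),
  stringsAsFactors = FALSE
)

# Join by name so the pairing no longer depends on sort order
hp_lookup <- merge(top_hp, sprite_lookup, by = "Name")

# Rank from lowest to highest HP, matching the bar order after coord_flip()
hp_lookup <- hp_lookup[order(hp_lookup$HP), ]
hp_lookup$rank <- seq_len(nrow(hp_lookup))
hp_lookup
```

With a lookup like this, renaming a sprite file cannot silently attach it to the wrong bar.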
# download.file("https://archives.bulbagarden.net/media/upload/6/67/150MXMS.png", "mewtwo-a-x.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/72/214MMS.png", "hera-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ad/384MMS.png", "ray-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/98/383PMS.png", "groudon-a-p.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/07/386AMS.png", "deoxys-a-aform.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c0/646BMS.png", "kyurem-a-b.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/7f/445MMS.png", "garch-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/91/409MS.png", "ramp-a-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0a/475MMS.png", "gallade-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/5e/354MMS.png", "banette-a-m.png")
atk_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",
pattern = "-a-"), pokemon %>%
top_n(10, Attack) %>%
arrange(Name) %>%
mutate(rank = row_number(Attack)) %>%
select(Name, rank)) %>%
arrange(rank)
atk_sprites
## # A tibble: 10 x 3
## images Name rank
## <chr> <chr> <int>
## 1 banette-a-m.png BanetteMega Banette 1
## 2 gallade-a-m.png GalladeMega Gallade 2
## 3 ramp-a-sprite.png Rampardos 3
## 4 garch-a-m.png GarchompMega Garchomp 4
## 5 kyurem-a-b.png KyuremBlack Kyurem 5
## 6 deoxys-a-aform.png DeoxysAttack Forme 6
## 7 groudon-a-p.png GroudonPrimal Groudon 7
## 8 ray-a-m.png RayquazaMega Rayquaza 8
## 9 hera-a-m.png HeracrossMega Heracross 9
## 10 mewtwo-a-x.png MewtwoMega Mewtwo X 10
#download.file("https://cdn.bulbagarden.net/upload/7/7f/150Mewtwo-Mega_X.png", "mega-X.png")
img = readPNG("images/mega-X.png")
g = rasterGrob(img, interpolate=TRUE)
g_sprite = list()
atk_graph <- pokemon %>%
select(Name, Attack) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, Attack), y=Attack)) +
geom_bar(aes(fill=Attack), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Attack)) +
scale_fill_gradient(low="#b4acc5", high="#6a319c") +
coord_flip() +
labs(x="Name", title="Top 10 Attack Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=155, ymax=210)
for(i in 1:nrow(atk_sprites)){
img = readPNG(paste0("images/",atk_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
atk_graph = atk_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
atk_graph
# download.file("https://archives.bulbagarden.net/media/upload/2/29/181MMS.png", "ampharos-spa-,.png")
# download.file("https://archives.bulbagarden.net/media/upload/3/34/282MMS.png", "gardevoir-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/f4/094MMS.png", "gengar-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/64/720UMS.png", "hoopa-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/74/646WMS.png", "kyurem-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0c/065MMS.png", "alakazam-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/07/386AMS.png", "deoxys-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/02/382PMS.png", "kyogre-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ad/384MMS.png", "ray-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/ff/150MYMS.png", "mewtwo-spa-.png")
spatk_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",pattern = "-spa-"), pokemon %>%
top_n(10, SpAtk) %>%
arrange(Name) %>%
mutate(rank = row_number(SpAtk)) %>%
select(Name, rank)) %>%
arrange(rank)
spatk_sprites
## # A tibble: 10 x 3
## images Name rank
## <chr> <chr> <int>
## 1 ampharos-spa-,.png AmpharosMega Ampharos 1
## 2 gardevoir-spa-.png GardevoirMega Gardevoir 2
## 3 gengar-spa-.png GengarMega Gengar 3
## 4 hoopa-spa-.png HoopaHoopa Unbound 4
## 5 kyurem-spa-.png KyuremWhite Kyurem 5
## 6 alakazam-spa-.png AlakazamMega Alakazam 6
## 7 deoxys-spa-.png DeoxysAttack Forme 7
## 8 kyogre-spa-.png KyogrePrimal Kyogre 8
## 9 ray-spa-.png RayquazaMega Rayquaza 9
## 10 mewtwo-spa-.png MewtwoMega Mewtwo Y 10
#download.file("https://cdn.bulbagarden.net/upload/5/5f/150Mewtwo-Mega_Y.png", "mega-Y.png")
img = readPNG("images/mega-Y.png")
g = rasterGrob(img, interpolate=TRUE)
spatk_graph <- pokemon %>%
select(Name, SpAtk) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, SpAtk), y=SpAtk)) +
geom_bar(aes(fill=SpAtk), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=SpAtk)) +
scale_fill_gradient(low="#b4acc5", high="#6a319c") +
coord_flip() +
labs(x="Name", title="Top 10 SpAtk Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=4, ymin=160, ymax=210)
for(i in 1:nrow(spatk_sprites)){
img = readPNG(paste0("images/",spatk_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
spatk_graph = spatk_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
spatk_graph
# download.file("https://archives.bulbagarden.net/media/upload/5/5b/306MMS.png", "aggron-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/d/d5/306MS.png", "aggron-d-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/53/713MS.png", "avalugg-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/ef/411MS.png", "bastiodon-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ac/091XYMS.png", "cloyster-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/6c/377MS.png", "regirock-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0a/213MS.png", "shuckle-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/65/080MMS.png", "slobrow-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/b/bf/208MS.png", "steelix-d-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/05/208MMS.png", "steelix-d.png")
def_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",
pattern = "-d"), pokemon %>%
top_n(10, Defense) %>%
arrange(Name) %>%
mutate(rank = row_number(Defense)) %>%
select(Name, rank)) %>%
arrange(rank)
def_sprites
## # A tibble: 10 x 3
## images Name rank
## <chr> <chr> <int>
## 1 bastiodon-d.png Bastiodon 1
## 2 aggron-d-n.png Aggron 2
## 3 cloyster-d.png Cloyster 3
## 4 slobrow-d.png SlowbroMega Slowbro 4
## 5 avalugg-d.png Avalugg 5
## 6 regirock-d.png Regirock 6
## 7 steelix-d-n.png Steelix 7
## 8 aggron-d.png AggronMega Aggron 8
## 9 shuckle-d.png Shuckle 9
## 10 steelix-d.png SteelixMega Steelix 10
#download.file("https://cdn.bulbagarden.net/upload/1/1b/208Steelix-Mega.png", "mega-steel.png")
img = readPNG("images/mega-steel.png")
g = rasterGrob(img, interpolate=TRUE)
def_graph <- pokemon %>%
select(Name, Defense) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, Defense), y=Defense)) +
geom_bar(aes(fill=Defense), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Defense)) +
scale_fill_gradient(low="#dee6de", high="#7b94a4") +
coord_flip() +
labs(x="Name", title="Top 10 Defense Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=180, ymax=245)
for(i in 1:nrow(def_sprites)){
img = readPNG(paste0("images/",def_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
def_graph = def_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
def_graph
# download.file("https://archives.bulbagarden.net/media/upload/3/37/681MS.png", "aegislash-spd,.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/5b/703MS.png", "carbink-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/20/386DMS.png", "deoxys-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/53/719MS.png", "diancie-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/ea/671MS.png", "florges-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/3/3c/706MS.png", "goodra-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/ee/250MS.png", "Hooh-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/02/382PMS.png", "kyogre-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/24/380MMS.png", "latias-spd-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c8/249MS.png", "lugia-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/aa/476MS.png", "probopass-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/99/378MS.png", "regice-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/2e/379MS.png", "registeel-spd-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0a/213MS.png", "shuckle-spd.png")
spdef_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/", pattern = "-spd"), pokemon %>%
top_n(10, SpDef) %>%
arrange(Name) %>%
mutate(rank = row_number(SpDef)) %>%
select(Name, rank)) %>%
arrange(rank)
spdef_sprites
## # A tibble: 14 x 3
## images Name rank
## <chr> <chr> <int>
## 1 aegislash-spd,.png AegislashShield Forme 1
## 2 carbink-spd.png Carbink 2
## 3 diancie-spd.png Diancie 3
## 4 goodra-spd.png Goodra 4
## 5 latias-spd-n.png LatiasMega Latias 5
## 6 probopass-spd.png Probopass 6
## 7 registeel-spd-n.png Registeel 7
## 8 florges-spd.png Florges 8
## 9 Hooh-spd.png Ho-oh 9
## 10 lugia-spd.png Lugia 10
## 11 deoxys-spd.png DeoxysDefense Forme 11
## 12 kyogre-spd.png KyogrePrimal Kyogre 12
## 13 regice-spd.png Regice 13
## 14 shuckle-spd.png Shuckle 14
#download.file("https://cdn.bulbagarden.net/upload/c/c7/213Shuckle.png", "shuckle.png")
img = readPNG("images/shuckle.png")
g = rasterGrob(img, interpolate=TRUE)
spdef_graph <- pokemon %>%
select(Name, SpDef) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, SpDef), y=SpDef)) +
geom_bar(aes(fill=SpDef), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=SpDef)) +
scale_fill_gradient(low="#ffff5a", high="#b43129") +
coord_flip() +
labs(x="Name", title="Top 10 SpDef Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=180, ymax=230)
for(i in 1:nrow(spdef_sprites)){
img = readPNG(paste0("images/", spdef_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
spdef_graph = spdef_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
spdef_graph
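Notice that `spdef_sprites` has 14 rows rather than 10. That is because `top_n()` keeps every row tied with the cutoff value. If exactly ten bars are wanted, `slice_max()` with `with_ties = FALSE` is one way to get them (it breaks ties by row order, which is arbitrary). Here is a small sketch on made-up data; the names `A` to `E` and their values are purely illustrative.

```r
library(dplyr)

# Made-up data with a three-way tie at the cutoff
ties_demo <- tibble(
  Name  = c("A", "B", "C", "D", "E"),
  SpDef = c(230, 200, 150, 150, 150)
)

# top_n() keeps every row tied with the third-highest value
kept_with_ties <- ties_demo %>% top_n(3, SpDef)

# slice_max(with_ties = FALSE) returns exactly n rows
kept_exact <- ties_demo %>% slice_max(SpDef, n = 3, with_ties = FALSE)

nrow(kept_with_ties)
nrow(kept_exact)
```

Whether to keep or drop the ties is a judgment call; keeping them, as the plot here does, is arguably fairer to the tied Pokemon.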
# download.file("https://archives.bulbagarden.net/media/upload/0/07/386AMS.png", "deoxys-spe-a.png")
# download.file("https://archives.bulbagarden.net/media/upload/8/86/386MS.png", "deoxys-spe-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/fa/386SMS.png", "deoxys-spe-s.png")
# download.file("https://archives.bulbagarden.net/media/upload/3/33/617MS.png", "accelgor-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c4/142MMS.png", "aerodac-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0c/065MMS.png", "alakazam-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c8/015MMS.png", "buzz-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/69/101MS.png", "electrocude-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/ff/150MYMS.png", "mewtwo-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/54/291MS.png", "ninjask-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/27/254MMS.png", "sceptile-spe.png")
speed_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/", pattern = "-spe"), pokemon %>%
top_n(10, Speed) %>%
arrange(Name) %>%
mutate(rank = row_number(Speed)) %>%
select(Name, rank)) %>%
arrange(rank)
speed_sprites
## # A tibble: 11 x 3
## images Name rank
## <chr> <chr> <int>
## 1 electrocude-spe.png Electrode 1
## 2 mewtwo-spe.png MewtwoMega Mewtwo Y 2
## 3 accelgor-spe.png Accelgor 3
## 4 buzz-spe.png BeedrillMega Beedrill 4
## 5 sceptile-spe.png SceptileMega Sceptile 5
## 6 aerodac-spe.png AerodactylMega Aerodactyl 6
## 7 alakazam-spe.png AlakazamMega Alakazam 7
## 8 deoxys-spe-a.png DeoxysAttack Forme 8
## 9 deoxys-spe-n.png DeoxysNormal Forme 9
## 10 ninjask-spe.png Ninjask 10
## 11 deoxys-spe-s.png DeoxysSpeed Forme 11
#download.file("https://cdn.bulbagarden.net/upload/2/2b/386Deoxys-Speed.png", "speed.png")
img = readPNG("images/speed.png")
g = rasterGrob(img, interpolate=TRUE)
speed_graph <- pokemon %>%
select(Name, Speed) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, Speed), y=Speed)) +
geom_bar(aes(fill=Speed), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Speed)) +
scale_fill_gradient(low="#5294ac", high="#ff734a") +
coord_flip() +
labs(x="Name", title="Top 10 Speed Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=150, ymax=200)
for(i in 1:nrow(speed_sprites)){
img = readPNG(paste0("images/",speed_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
speed_graph = speed_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
speed_graph
# download.file("https://archives.bulbagarden.net/media/upload/0/0e/493OD_DP.png", "arceus-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ae/719MMS.png", "diancie-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/7f/445MMS.png", "garchomp-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/98/383PMS.png", "groudon-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/02/382PMS.png", "kyogre-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c0/646BMS.png", "kyurem-tol-b.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/74/646WMS.png", "kyurem-tol-w.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/24/380MMS.png", "latias-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/2d/381MMS.png", "latios-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/54/376MMS.png", "metagross-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/67/150MXMS.png", "mewtwo-tol-x.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/ff/150MYMS.png", "mewtwo-tol-y.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ad/384MMS.png", "ray-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/4/4a/373MMS.png", "salamence-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/59/248MMS.png", "ttar-tol.png")
total_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/", pattern = "-tol"), pokemon %>%
top_n(10, Total) %>%
arrange(Name) %>%
mutate(rank = row_number(Total)) %>%
select(Name, rank)) %>%
arrange(rank)
total_sprites
## # A tibble: 15 x 3
## images Name rank
## <chr> <chr> <int>
## 1 diancie-tol.png DiancieMega Diancie 1
## 2 garchomp-tol.png GarchompMega Garchomp 2
## 3 kyurem-tol-b.png KyuremBlack Kyurem 3
## 4 kyurem-tol-w.png KyuremWhite Kyurem 4
## 5 latias-tol.png LatiasMega Latias 5
## 6 latios-tol.png LatiosMega Latios 6
## 7 metagross-tol.png MetagrossMega Metagross 7
## 8 salamence-tol.png SalamenceMega Salamence 8
## 9 ttar-tol.png TyranitarMega Tyranitar 9
## 10 arceus-tol.png Arceus 10
## 11 groudon-tol.png GroudonPrimal Groudon 11
## 12 kyogre-tol.png KyogrePrimal Kyogre 12
## 13 mewtwo-tol-x.png MewtwoMega Mewtwo X 13
## 14 mewtwo-tol-y.png MewtwoMega Mewtwo Y 14
## 15 ray-tol.png RayquazaMega Rayquaza 15
#download.file("https://cdn.bulbagarden.net/upload/5/58/384Rayquaza-Mega.png", "mega-raq.png")
img = readPNG("images/mega-raq.png")
g = rasterGrob(img, interpolate=TRUE)
total_graph <- pokemon %>%
select(Name, Total) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, Total), y=Total)) +
geom_bar(aes(fill=Total), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Total)) +
coord_flip() +
scale_fill_gradient(low="#f6de00", high="#5abd8b") +
labs(x="Name", title="Top 10 Total Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=600, ymax=900)
for(i in 1:nrow(total_sprites)){
img = readPNG(paste0("images/",total_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
total_graph = total_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-40, ymax=10)
}
total_graph
pokemon %>%
ggplot(aes(x = PokedexNum, y = Total, color = classification)) +
geom_point()
Again, we see that Legendary and Mega Pokemon overlap with each other (and, to a lesser extent, with Normal Pokemon), while most Normal Pokemon sit well apart from both classes.
pokemon %>%
count(Generation) %>%
ggplot(aes(x=Generation, y=n, fill = Generation, color = Generation)) +
geom_bar(stat="identity") +
geom_label(aes(label=n)) +
labs(x="Generation", y="Number of Pokémon",
title="Number of Pokémon per generation") +
scale_fill_manual(values = c("#2062ac", "#deac00", "#ff2029", "#cdb4d5", "#181820", "#6275b9"),
guide = "none") +
scale_color_manual(values = c("white", "white", "white", "white", "white", "white"),
guide = "none")
Generations 1, 3, and 5 each introduce a similar number of Pokemon (~160), while the other three Generations vary more, with the latest Generation introducing the fewest new Pokemon.
ggplot(pokemon, aes(x=Type1, fill=Generation)) +
geom_bar() +
labs(x="Primary Type", y="Number of Pokémon",
title="Number of Pokémon of each primary type per generation") +
scale_fill_manual(values = c("#2062ac", "#deac00", "#ff2029", "#cdb4d5", "#181820", "#6275b9"),
guide = "none")
We see a similar distribution of types between Generations.
ggplot(pokemon, aes(x=Generation, fill=Type1)) +
geom_bar() +
labs(x="Generation", y="Number of Pokémon",
title="Number of Pokémon of each primary type per generation") +
scale_fill_manual(values = type_colors)
The same information is conveyed in this graph with the variables flipped.
ggplot(pokemon, aes(x=Generation, fill=isMultiType)) +
geom_bar() +
labs(x="Generation", y="Number of Pokémon",
title="Number of single- and dual-type Pokémon per generation")
About half of the Pokemon in any Generation have a secondary type.
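The share can also be computed directly instead of being read off the bars. Here is a minimal base R sketch that assumes, as in this data, that a missing `Type2` means a single-type Pokemon. `type_demo` is a hypothetical six-row table, so the shares it prints describe only these toy rows.

```r
# Toy data: three Generation 1 and three Generation 2 Pokemon
type_demo <- data.frame(
  Name       = c("Bulbasaur", "Charmander", "Charizard",
                 "Sunkern", "Pupitar", "Sudowoodo"),
  Type2      = c("Poison", NA, "Flying", NA, "Ground", NA),
  Generation = c(1, 1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Share of rows per generation with a non-missing secondary type
dual_share <- tapply(!is.na(type_demo$Type2), type_demo$Generation, mean)
dual_share
```

On the real data the same idea should work with the existing `isMultiType` column in place of `!is.na(Type2)`.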
ggplot(pokemon, aes(x=Generation, fill=isMega)) +
geom_bar() +
labs(x="Generation", y="Number of Pokémon",
title="Number of Mega and non-Mega Pokémon per generation")
Megas seem to be reserved for Pokemon from older Generations. This is in line with the introduction, as only Generation 1 was allowed to have Mega Evolutions at first.
library(class) #for KNN
library(caret) #for cross validation of KNN method
test <- pokemon %>% group_by(classification) %>% sample_frac(.2)
train_data <- setdiff(pokemon, test)
dim(test)
## [1] 160 18
dim(train_data)
## [1] 640 18
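Because the split above is random and no seed is set, the rows that land in `test` will differ on every run. Below is a sketch of a reproducible stratified split in base R. The seed value, the `split_demo` table and its `id` column are all assumptions for illustration; the real data has no `id` column, so row indices are used instead.

```r
set.seed(42)  # arbitrary seed, only so the split can be reproduced

# Made-up data: 15 "Normal" rows and 5 "Legendary" rows
split_demo <- data.frame(
  id = 1:20,
  classification = rep(c("Normal", "Legendary"), times = c(15, 5)),
  stringsAsFactors = FALSE
)

# Sample 20% of the row indices within each class
test_idx <- unlist(lapply(
  split(seq_len(nrow(split_demo)), split_demo$classification),
  function(idx) idx[sample.int(length(idx), round(0.2 * length(idx)))]
))

# Everything not sampled becomes training data
test_demo  <- split_demo[test_idx, ]
train_demo <- split_demo[-test_idx, ]

dim(test_demo)
dim(train_demo)
```

Splitting by row index also avoids relying on `setdiff()` to match whole rows.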
glimpse(test)
## Rows: 160
## Columns: 18
## Groups: classification [3]
## $ PokedexNum <int> 483, 380, 486, 721, 642, 487, 494, 646, 718, 386, 640,…
## $ Name <chr> "Dialga", "Latias", "Regigigas", "Volcanion", "Thundur…
## $ Type1 <chr> "Steel", "Dragon", "Normal", "Fire", "Electric", "Ghos…
## $ Type2 <chr> "Dragon", "Psychic", NA, "Water", "Flying", "Dragon", …
## $ Total <int> 680, 600, 670, 600, 580, 680, 600, 660, 600, 600, 580,…
## $ HP <int> 100, 80, 110, 80, 79, 150, 100, 125, 108, 50, 91, 75, …
## $ Attack <int> 120, 80, 160, 110, 115, 120, 100, 130, 100, 70, 90, 12…
## $ Defense <int> 120, 90, 110, 120, 70, 100, 100, 90, 121, 160, 72, 70,…
## $ SpAtk <int> 150, 110, 80, 130, 125, 120, 100, 130, 81, 70, 90, 125…
## $ SpDef <int> 100, 130, 110, 90, 80, 100, 100, 90, 95, 160, 129, 70,…
## $ Speed <int> 90, 110, 100, 70, 111, 90, 100, 95, 95, 90, 108, 115, …
## $ Generation <fct> 4, 3, 4, 6, 5, 4, 5, 5, 6, 3, 5, 4, 3, 1, 4, 4, 2, 1, …
## $ Legendary <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, …
## $ AtkTotal <dbl> 270, 190, 240, 240, 240, 240, 200, 260, 181, 140, 180,…
## $ DefTotal <dbl> 220, 220, 220, 210, 150, 200, 200, 180, 216, 320, 201,…
## $ isMega <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
## $ isMultiType <lgl> TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,…
## $ classification <fct> Legendary, Legendary, Legendary, Legendary, Legendary,…
glimpse(train_data)
## Rows: 640
## Columns: 18
## $ PokedexNum <int> 1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 9, 10, 11, 12, 13, 14, 1…
## $ Name <chr> "Bulbasaur", "Ivysaur", "Venusaur", "Charmander", "Cha…
## $ Type1 <chr> "Grass", "Grass", "Grass", "Fire", "Fire", "Fire", "Fi…
## $ Type2 <chr> "Poison", "Poison", "Poison", NA, NA, "Flying", "Flyin…
## $ Total <int> 318, 405, 525, 309, 405, 534, 634, 314, 405, 530, 630,…
## $ HP <int> 45, 60, 80, 39, 58, 78, 78, 44, 59, 79, 79, 45, 50, 60…
## $ Attack <int> 49, 62, 82, 52, 64, 84, 104, 48, 63, 83, 103, 30, 20, …
## $ Defense <int> 49, 63, 83, 43, 58, 78, 78, 65, 80, 100, 120, 35, 55, …
## $ SpAtk <int> 65, 80, 100, 60, 80, 109, 159, 50, 65, 85, 135, 20, 25…
## $ SpDef <int> 65, 80, 100, 50, 65, 85, 115, 64, 80, 105, 115, 20, 25…
## $ Speed <int> 45, 60, 80, 65, 80, 100, 100, 43, 58, 78, 78, 45, 30, …
## $ Generation <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ Legendary <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE…
## $ AtkTotal <dbl> 114, 142, 182, 112, 144, 193, 263, 98, 128, 168, 238, …
## $ DefTotal <dbl> 114, 143, 183, 93, 123, 163, 193, 129, 160, 205, 235, …
## $ isMega <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE,…
## $ isMultiType <lgl> TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FAL…
## $ classification <fct> Normal, Normal, Normal, Normal, Normal, Normal, Mega, …
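A note before fitting: `knn()` works on Euclidean distance, so a stat with a wider range (HP runs from 1 to 255 here) can weigh more than the others. One common option, which the analysis below does not use, is to standardise the features with the training set's means and standard deviations. Here is a minimal sketch with base R's `scale()`; `train_demo` and `test_demo` are made-up tables with two features only.

```r
# Made-up feature tables
train_demo <- data.frame(HP = c(45, 60, 80, 255), Speed = c(45, 60, 80, 55))
test_demo  <- data.frame(HP = c(50, 100), Speed = c(90, 30))

# Centre and scale the training features
train_scaled <- scale(train_demo)

# Reuse the training means and standard deviations for the test set
test_scaled <- scale(test_demo,
                     center = attr(train_scaled, "scaled:center"),
                     scale  = attr(train_scaled, "scaled:scale"))

train_scaled
test_scaled
```

The scaled matrices could then be passed to `knn()` as `train` and `test`. Since all six stats share the same units, leaving them unscaled is also a defensible choice.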
class.knn.20 = knn(
train = train_data[6:11], # training data for the features used in classification
test = test[6:11], # test data for the features used in classification
cl = train_data$classification, # vector of class labels for the training data
k = 20)
class.knn.20
## [1] Legendary Normal Normal Normal Normal Legendary Legendary
## [8] Legendary Normal Normal Normal Normal Legendary Normal
## [15] Normal Normal Normal Legendary Legendary Normal Normal
## [22] Legendary Normal Normal Normal Normal Normal Normal
## [29] Normal Normal Normal Normal Normal Normal Normal
## [36] Normal Normal Normal Normal Normal Normal Normal
## [43] Normal Normal Normal Normal Normal Normal Normal
## [50] Normal Normal Normal Normal Normal Normal Normal
## [57] Normal Normal Normal Normal Normal Normal Normal
## [64] Normal Normal Normal Normal Normal Normal Normal
## [71] Normal Normal Normal Normal Normal Normal Normal
## [78] Normal Normal Normal Normal Normal Normal Normal
## [85] Normal Normal Normal Normal Normal Normal Normal
## [92] Normal Normal Normal Normal Normal Normal Normal
## [99] Normal Normal Normal Normal Normal Normal Normal
## [106] Normal Normal Normal Normal Normal Normal Normal
## [113] Normal Normal Normal Normal Normal Normal Normal
## [120] Normal Normal Normal Normal Normal Normal Normal
## [127] Normal Normal Normal Normal Normal Normal Normal
## [134] Normal Normal Normal Normal Normal Normal Normal
## [141] Normal Normal Normal Normal Normal Normal Normal
## [148] Normal Normal Normal Normal Normal Legendary Normal
## [155] Normal Normal Normal Normal Normal Normal
## Levels: Legendary Mega Normal
tibble(test[c("Name", "classification")], class.knn.20)
## # A tibble: 160 x 3
## Name classification class.knn.20
## <chr> <fct> <fct>
## 1 Dialga Legendary Legendary
## 2 Latias Legendary Normal
## 3 Regigigas Legendary Normal
## 4 Volcanion Legendary Normal
## 5 ThundurusIncarnate Forme Legendary Normal
## 6 GiratinaOrigin Forme Legendary Legendary
## 7 Victini Legendary Legendary
## 8 Kyurem Legendary Legendary
## 9 Zygarde50% Forme Legendary Normal
## 10 DeoxysDefense Forme Legendary Normal
## # … with 150 more rows
class.knn.conf.20 = table(true = test$classification, predicted = class.knn.20)
class.knn.conf.20
## predicted
## true Legendary Mega Normal
## Legendary 4 0 8
## Mega 4 0 6
## Normal 1 0 137
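The confusion matrix above can also be summarized per class. As a minimal sketch in base R (the matrix is re-entered by hand from the k = 20 output above, and `conf_k20`, `accuracy`, and `recall` are illustrative names rather than objects from the original analysis), per-class recall is the diagonal divided by the row totals:

```r
# Confusion matrix for k = 20, copied by hand from the output above
# (rows = true class, columns = predicted class).
classes <- c("Legendary", "Mega", "Normal")
conf_k20 <- matrix(
  c(4, 0,   8,
    4, 0,   6,
    1, 0, 137),
  nrow = 3, byrow = TRUE,
  dimnames = list(true = classes, predicted = classes)
)

# Overall accuracy: correctly classified rows over all test rows.
accuracy <- sum(diag(conf_k20)) / sum(conf_k20)

# Per-class recall: share of each true class that was predicted correctly.
recall <- diag(conf_k20) / rowSums(conf_k20)

round(accuracy, 3)
round(recall, 3)
```

On these numbers overall accuracy is about 0.88 even though no Pokemon is ever predicted as Mega, which is why accuracy alone can flatter a classifier when one class dominates.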
class.knn.50 = knn(
train = train_data[6:11], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = train_data$classification, # vector of class labels for training data
k = 50)
class.knn.50
## [1] Legendary Normal Normal Normal Normal Legendary Normal
## [8] Legendary Normal Normal Normal Normal Legendary Normal
## [15] Normal Normal Normal Legendary Normal Normal Normal
## [22] Legendary Normal Normal Normal Normal Normal Normal
## [29] Normal Normal Normal Normal Normal Normal Normal
## [36] Normal Normal Normal Normal Normal Normal Normal
## [43] Normal Normal Normal Normal Normal Normal Normal
## [50] Normal Normal Normal Normal Normal Normal Normal
## [57] Normal Normal Normal Normal Normal Normal Normal
## [64] Normal Normal Normal Normal Normal Normal Normal
## [71] Normal Normal Normal Normal Normal Normal Normal
## [78] Normal Normal Normal Normal Normal Normal Normal
## [85] Normal Normal Normal Normal Normal Normal Normal
## [92] Normal Normal Normal Normal Normal Normal Normal
## [99] Normal Normal Normal Normal Normal Normal Normal
## [106] Normal Normal Normal Normal Normal Normal Normal
## [113] Normal Normal Normal Normal Normal Normal Normal
## [120] Normal Normal Normal Normal Normal Normal Normal
## [127] Normal Normal Normal Normal Normal Normal Normal
## [134] Normal Normal Normal Normal Normal Normal Normal
## [141] Normal Normal Normal Normal Normal Normal Normal
## [148] Normal Normal Normal Normal Normal Normal Normal
## [155] Normal Normal Normal Normal Normal Normal
## Levels: Legendary Mega Normal
tibble(test[c("Name", "classification")], class.knn.50)
## # A tibble: 160 x 3
## Name classification class.knn.50
## <chr> <fct> <fct>
## 1 Dialga Legendary Legendary
## 2 Latias Legendary Normal
## 3 Regigigas Legendary Normal
## 4 Volcanion Legendary Normal
## 5 ThundurusIncarnate Forme Legendary Normal
## 6 GiratinaOrigin Forme Legendary Legendary
## 7 Victini Legendary Normal
## 8 Kyurem Legendary Legendary
## 9 Zygarde50% Forme Legendary Normal
## 10 DeoxysDefense Forme Legendary Normal
## # … with 150 more rows
class.knn.conf.50 = table(true = test$classification, predicted = class.knn.50)
class.knn.conf.50
## predicted
## true Legendary Mega Normal
## Legendary 3 0 9
## Mega 3 0 7
## Normal 0 0 138
trControl <- trainControl(method = "cv",
number = 20)
fit <- train(classification ~ HP + Attack + Defense + SpAtk + SpDef + Speed,
method = "knn",
tuneGrid = expand.grid(k = 1:50),
trControl = trControl,
metric = "Accuracy",
data = train_data
)
fit
## k-Nearest Neighbors
##
## 640 samples
## 6 predictor
## 3 classes: 'Legendary', 'Mega', 'Normal'
##
## No pre-processing
## Resampling: Cross-Validated (20 fold)
## Summary of sample sizes: 607, 608, 609, 608, 608, 607, ...
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 1 0.8986214 0.5449567
## 2 0.9002343 0.5652602
## 3 0.8970146 0.5203502
## 4 0.8939874 0.4883874
## 5 0.9048332 0.5444305
## 6 0.9000388 0.5020137
## 7 0.9078070 0.5359710
## 8 0.9030691 0.4852282
## 9 0.9062949 0.5168652
## 10 0.9016990 0.4879702
## 11 0.9061941 0.4873755
## 12 0.9046316 0.4882326
## 13 0.9031164 0.4779515
## 14 0.8984763 0.4450063
## 15 0.9000861 0.4431106
## 16 0.8999914 0.4413798
## 17 0.8921759 0.3921029
## 18 0.8921728 0.3840586
## 19 0.8905630 0.3777948
## 20 0.8921255 0.3767443
## 21 0.8905630 0.3474226
## 22 0.8922297 0.3750946
## 23 0.8876369 0.3359588
## 24 0.8891994 0.3402897
## 25 0.8891047 0.3350749
## 26 0.8906672 0.3301477
## 27 0.8923274 0.3554081
## 28 0.8954051 0.3607145
## 29 0.8891520 0.3346144
## 30 0.8908123 0.3468491
## 31 0.8907619 0.3361628
## 32 0.8891994 0.3332768
## 33 0.8891520 0.3235309
## 34 0.8891520 0.3235309
## 35 0.8907145 0.3349036
## 36 0.8906672 0.3328042
## 37 0.8891520 0.3319202
## 38 0.8891047 0.3340157
## 39 0.8891047 0.3301286
## 40 0.8859228 0.3053315
## 41 0.8859228 0.2920119
## 42 0.8859228 0.2920119
## 43 0.8843603 0.2919405
## 44 0.8874853 0.3032871
## 45 0.8859702 0.2948302
## 46 0.8890478 0.3059627
## 47 0.8890478 0.3014605
## 48 0.8890478 0.3014605
## 49 0.8890478 0.2975734
## 50 0.8890478 0.2930713
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 7.
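The Kappa column in the table above adjusts accuracy for the agreement expected by chance, which matters here because most Pokemon are Normal. As a rough sketch (a hand-rolled helper named `cohen_kappa`, which is not a caret function, applied to the k = 20 test-set confusion matrix printed earlier rather than to the cross-validation folds), Cohen's kappa can be computed like this:

```r
# Cohen's kappa from a square confusion matrix
# (rows = true class, columns = predicted class).
cohen_kappa <- function(conf) {
  n <- sum(conf)
  p_observed <- sum(diag(conf)) / n                        # raw accuracy
  p_expected <- sum(rowSums(conf) * colSums(conf)) / n^2   # chance agreement
  (p_observed - p_expected) / (1 - p_expected)
}

# k = 20 confusion matrix, re-entered by hand from the earlier output.
conf_k20 <- matrix(
  c(4, 0,   8,
    4, 0,   6,
    1, 0, 137),
  nrow = 3, byrow = TRUE
)

kappa_k20 <- cohen_kappa(conf_k20)
round(kappa_k20, 3)
```

In the table above Kappa peaks at k = 2 while Accuracy peaks at k = 7. caret's `train()` accepts `metric = "Kappa"` if agreement beyond chance is the preferred selection criterion.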
k = fit$results$k[which.max(fit$results$Accuracy)]
class.knn = knn(
train = train_data[6:11], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = train_data$classification, # vector of class labels for training data
k = k)
class.knn
## [1] Legendary Normal Mega Legendary Legendary Legendary Normal
## [8] Legendary Legendary Normal Normal Normal Mega Normal
## [15] Mega Normal Normal Mega Legendary Legendary Normal
## [22] Legendary Normal Normal Normal Normal Normal Normal
## [29] Normal Legendary Normal Normal Normal Normal Normal
## [36] Normal Normal Normal Normal Normal Normal Normal
## [43] Normal Normal Normal Normal Normal Normal Normal
## [50] Normal Normal Normal Normal Normal Normal Normal
## [57] Normal Normal Normal Normal Normal Normal Normal
## [64] Normal Normal Normal Normal Normal Normal Normal
## [71] Normal Normal Normal Normal Normal Normal Normal
## [78] Normal Normal Normal Normal Normal Normal Normal
## [85] Normal Normal Normal Normal Normal Normal Normal
## [92] Normal Normal Normal Normal Normal Normal Normal
## [99] Normal Normal Normal Normal Normal Normal Normal
## [106] Normal Normal Normal Normal Normal Normal Normal
## [113] Normal Normal Normal Normal Normal Normal Normal
## [120] Normal Normal Normal Normal Normal Normal Normal
## [127] Normal Normal Normal Normal Normal Normal Normal
## [134] Normal Normal Normal Normal Normal Normal Normal
## [141] Normal Normal Normal Normal Normal Normal Normal
## [148] Normal Normal Normal Normal Normal Normal Normal
## [155] Normal Normal Normal Normal Normal Normal
## Levels: Legendary Mega Normal
error_knn <- tibble(test[c("Name", "classification")], class.knn) %>%
filter(classification != class.knn)
error_knn
## # A tibble: 14 x 3
## Name classification class.knn
## <chr> <fct> <fct>
## 1 Latias Legendary Normal
## 2 Regigigas Legendary Mega
## 3 Victini Legendary Normal
## 4 DeoxysDefense Forme Legendary Normal
## 5 Virizion Legendary Normal
## 6 Azelf Legendary Normal
## 7 BeedrillMega Beedrill Mega Normal
## 8 LopunnyMega Lopunny Mega Normal
## 9 SteelixMega Steelix Mega Normal
## 10 SceptileMega Sceptile Mega Legendary
## 11 VenusaurMega Venusaur Mega Legendary
## 12 SwampertMega Swampert Mega Normal
## 13 LatiasMega Latias Mega Legendary
## 14 Cresselia Normal Legendary
class.knn.conf = table(true = test$classification, predicted = class.knn)
class.knn.conf
## predicted
## true Legendary Mega Normal
## Legendary 6 1 5
## Mega 3 3 4
## Normal 1 0 137
nrow(error_knn)
## [1] 14
error_rate = nrow(error_knn) / nrow(test)
error_rate
## [1] 0.0875
detach("package:class", unload = TRUE)
# Note: despite its name, testing_df holds the training rows (class label plus the six base stats)
testing_df <- train_data %>%
select(classification, HP, Attack, Defense, SpAtk, SpDef, Speed)
utrain <- upSample(testing_df[,-1], testing_df$classification)
table(utrain$Class)
##
## Legendary Mega Normal
## 554 554 554
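Upsampling balances the classes by resampling the rarer ones with replacement until each matches the largest class, which is how all three classes reach 554 rows above. The sketch below illustrates the idea in base R on a made-up toy data frame (`toy`, `upsample_rows`, and the seed are illustrative assumptions, and caret's `upSample()` may differ in its details):

```r
set.seed(42)

# Toy data: six Normal rows, three Mega rows, one Legendary row (values are made up).
toy <- data.frame(
  HP    = c(45, 60, 80, 39, 58, 78, 106, 100, 80, 95),
  Class = factor(c(rep("Normal", 6), rep("Mega", 3), "Legendary"))
)

# Return row indices in which every class appears as often as the largest class.
upsample_rows <- function(labels) {
  target <- max(table(labels))
  rows_by_class <- split(seq_along(labels), labels)
  unlist(lapply(rows_by_class, function(rows) {
    # Draw the missing rows with replacement from the class's own rows.
    extra <- rows[sample.int(length(rows), target - length(rows), replace = TRUE)]
    c(rows, extra)
  }), use.names = FALSE)
}

toy_up <- toy[upsample_rows(toy$Class), ]
table(toy_up$Class)
```

Because the added rows are exact duplicates, copies of the same Pokemon can land in both the training and held-out folds, which may explain why the cross-validation that follows favors k = 1 with near-perfect accuracy.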
fit <- train(Class ~ .,
method = "knn",
tuneGrid = expand.grid(k = 1:50),
trControl = trControl,
metric = "Accuracy",
data = utrain
)
fit
## k-Nearest Neighbors
##
## 1662 samples
## 6 predictor
## 3 classes: 'Legendary', 'Mega', 'Normal'
##
## No pre-processing
## Resampling: Cross-Validated (20 fold)
## Summary of sample sizes: 1580, 1578, 1579, 1578, 1579, 1578, ...
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 1 0.9897870 0.9846803
## 2 0.9789428 0.9684118
## 3 0.9729110 0.9593646
## 4 0.9693033 0.9539542
## 5 0.9638813 0.9458223
## 6 0.9560568 0.9340810
## 7 0.9446679 0.9169943
## 8 0.9320306 0.8980401
## 9 0.9109588 0.8664401
## 10 0.8965069 0.8447613
## 11 0.8731050 0.8096610
## 12 0.8526432 0.7789634
## 13 0.8520262 0.7780401
## 14 0.8478452 0.7717811
## 15 0.8387867 0.7581726
## 16 0.8315714 0.7473524
## 17 0.8273755 0.7410587
## 18 0.8201534 0.7302256
## 19 0.8195582 0.7293107
## 20 0.8219608 0.7329182
## 21 0.8207123 0.7310522
## 22 0.8285735 0.7428740
## 23 0.8261781 0.7392506
## 24 0.8279561 0.7419504
## 25 0.8274483 0.7411740
## 26 0.8285957 0.7428943
## 27 0.8309260 0.7463805
## 28 0.8321672 0.7482451
## 29 0.8327549 0.7491222
## 30 0.8357890 0.7536641
## 31 0.8376396 0.7564451
## 32 0.8376111 0.7564207
## 33 0.8278775 0.7418213
## 34 0.8309337 0.7464250
## 35 0.8206990 0.7310590
## 36 0.8249087 0.7373734
## 37 0.8339385 0.7509232
## 38 0.8363700 0.7545758
## 39 0.8333431 0.7500273
## 40 0.8333652 0.7500503
## 41 0.8340038 0.7510316
## 42 0.8333793 0.7500933
## 43 0.8369870 0.7555082
## 44 0.8406235 0.7609784
## 45 0.8430333 0.7645757
## 46 0.8478167 0.7717541
## 47 0.8412331 0.7618986
## 48 0.8382280 0.7573816
## 49 0.8363842 0.7546206
## 50 0.8297644 0.7446846
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 1.
k = fit$results$k[which.max(fit$results$Accuracy)]
library(class) #for KNN
class.knn = knn(
train = utrain[,-7], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = utrain$Class, # vector of class labels for training data
k = k)
error_knn <- tibble(test[c("Name", "classification")], class.knn) %>%
filter(classification != class.knn)
error_knn
## # A tibble: 11 x 3
## Name classification class.knn
## <chr> <fct> <fct>
## 1 Regigigas Legendary Mega
## 2 Volcanion Legendary Mega
## 3 DeoxysDefense Forme Legendary Normal
## 4 Virizion Legendary Normal
## 5 BeedrillMega Beedrill Mega Normal
## 6 CharizardMega Charizard X Mega Normal
## 7 SceptileMega Sceptile Mega Legendary
## 8 SwampertMega Swampert Mega Normal
## 9 LatiasMega Latias Mega Legendary
## 10 Cresselia Normal Legendary
## 11 Manaphy Normal Legendary
class.knn.conf = table(true = test$classification, predicted = class.knn)
class.knn.conf
## predicted
## true Legendary Mega Normal
## Legendary 8 2 2
## Mega 2 5 3
## Normal 2 0 136
nrow(error_knn)
## [1] 11
error_rate = nrow(error_knn) / nrow(test)
error_rate
## [1] 0.06875
detach("package:class", unload = TRUE)
dtrain <- downSample(testing_df[,-1], testing_df$classification)
table(dtrain$Class)
##
## Legendary Mega Normal
## 39 39 39
fit <- train(Class ~ .,
method = "knn",
tuneGrid = expand.grid(k = 1:50),
trControl = trControl,
metric = "Accuracy",
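data = utrain # NOTE: this tunes k on the upsampled set (hence 1662 samples below); pass dtrain to tune on the downsampled data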
)
fit
## k-Nearest Neighbors
##
## 1662 samples
## 6 predictor
## 3 classes: 'Legendary', 'Mega', 'Normal'
##
## No pre-processing
## Resampling: Cross-Validated (20 fold)
## Summary of sample sizes: 1579, 1578, 1579, 1579, 1580, 1579, ...
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 1 0.9897727 0.9846576
## 2 0.9807792 0.9711652
## 3 0.9729330 0.9593965
## 4 0.9681060 0.9521561
## 5 0.9651155 0.9476669
## 6 0.9566737 0.9350120
## 7 0.9476294 0.9214462
## 8 0.9356237 0.9034329
## 9 0.9158003 0.8737045
## 10 0.9031849 0.8547733
## 11 0.8682283 0.8023532
## 12 0.8550180 0.7825278
## 13 0.8514250 0.7771370
## 14 0.8465834 0.7698828
## 15 0.8423732 0.7635665
## 16 0.8351874 0.7527935
## 17 0.8261216 0.7392074
## 18 0.8201185 0.7302025
## 19 0.8195017 0.7292707
## 20 0.8200822 0.7301348
## 21 0.8200966 0.7301675
## 22 0.8225284 0.7337864
## 23 0.8279431 0.7418851
## 24 0.8321957 0.7482630
## 25 0.8267590 0.7401178
## 26 0.8214242 0.7321073
## 27 0.8244511 0.7366539
## 28 0.8231952 0.7347709
## 29 0.8250024 0.7374713
## 30 0.8334661 0.7501532
## 31 0.8346277 0.7519234
## 32 0.8394911 0.7592123
## 33 0.8292134 0.7437890
## 34 0.8244084 0.7365846
## 35 0.8214029 0.7320755
## 36 0.8262370 0.7393197
## 37 0.8286909 0.7430208
## 38 0.8340692 0.7511007
## 39 0.8371106 0.7556658
## 40 0.8334594 0.7501784
## 41 0.8352812 0.7529101
## 42 0.8370959 0.7556201
## 43 0.8382647 0.7573679
## 44 0.8376768 0.7564776
## 45 0.8370741 0.7555546
## 46 0.8413130 0.7619081
## 47 0.8400352 0.7600077
## 48 0.8382278 0.7572916
## 49 0.8322105 0.7482768
## 50 0.8267742 0.7401266
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 1.
k = fit$results$k[which.max(fit$results$Accuracy)]
library(class) #for KNN
library(caret)
class.knn = knn(
train = dtrain[,-7], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = dtrain$Class, # vector of class labels for training data
k = k)
error_knn <- tibble(test[c("Name", "classification")], class.knn) %>%
filter(classification != class.knn)
error_knn
## # A tibble: 24 x 3
## Name classification class.knn
## <chr> <fct> <fct>
## 1 Regigigas Legendary Mega
## 2 Volcanion Legendary Mega
## 3 BeedrillMega Beedrill Mega Normal
## 4 SceptileMega Sceptile Mega Legendary
## 5 LatiasMega Latias Mega Legendary
## 6 Lapras Normal Legendary
## 7 Cresselia Normal Legendary
## 8 GourgeistSuper Size Normal Mega
## 9 Crustle Normal Mega
## 10 Politoed Normal Mega
## # … with 14 more rows
class.knn.conf = table(true = test$classification, predicted = class.knn)
class.knn.conf
## predicted
## true Legendary Mega Normal
## Legendary 10 2 0
## Mega 2 7 1
## Normal 10 9 119
nrow(error_knn)
## [1] 24
error_rate = nrow(error_knn) / nrow(test)
error_rate
## [1] 0.15
detach("package:class", unload = TRUE)
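The three test-set confusion matrices above (original training data, upsampled, and downsampled) can be compared on more than raw error rate. As a sketch (the matrices are re-entered by hand from the outputs above, and `make_conf` and `balanced_accuracy` are illustrative helpers, not part of caret or class), the mean per-class recall weights Legendary, Mega, and Normal equally:

```r
classes <- c("Legendary", "Mega", "Normal")

# Build a labeled 3 x 3 confusion matrix (rows = true, columns = predicted).
make_conf <- function(values) {
  matrix(values, nrow = 3, byrow = TRUE,
         dimnames = list(true = classes, predicted = classes))
}

# Values copied by hand from the three confusion matrices printed above.
conf <- list(
  original    = make_conf(c( 6, 1,   5,   3, 3, 4,    1, 0, 137)),
  upsampled   = make_conf(c( 8, 2,   2,   2, 5, 3,    2, 0, 136)),
  downsampled = make_conf(c(10, 2,   0,   2, 7, 1,   10, 9, 119))
)

# Balanced accuracy: the mean of the per-class recalls.
balanced_accuracy <- function(m) mean(diag(m) / rowSums(m))

summary_tbl <- data.frame(
  accuracy          = sapply(conf, function(m) sum(diag(m)) / sum(m)),
  balanced_accuracy = sapply(conf, balanced_accuracy)
)
round(summary_tbl, 3)
```

On these numbers the downsampled model has the lowest overall accuracy (0.85) but the highest mean recall (about 0.80), since it recovers more Legendary and Mega Pokemon at the cost of mislabeling 19 Normal ones.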
Thank you to Alberto Barradas for the dataset, and to Xavier for inspiring some of my EDA graphs. PokePalettes was a huge help in determining many of the HTML color codes used in the graphs. Thanks to Bulbapedia for all the .png files seen in my graphs and for being an essential “Pokemon encyclopedia” for the entire community. Finally, thank you to Pokemon, Game Freak, and Nintendo for all the great memories and for continuing to put out games so that generation after generation can experience the same euphoria.
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
rm(list = ls())
library(tidyverse)
library(ggrepel)
library(png)
library(grid)
setwd("~/Desktop/code/R/Pokemon")
pokemon <- read_csv("Pokemon.csv")
glimpse(pokemon)
pokemon <- pokemon %>%
rename(SpAtk = `Sp. Atk`, SpDef = `Sp. Def`, Type1 = `Type 1`, Type2 = `Type 2`, PokedexNum = `#`)
pokemon <- pokemon %>%
mutate(AtkTotal = Attack + SpAtk,
DefTotal = Defense + SpDef,
isMega = grepl("Mega", Name, ignore.case = FALSE),
isMultiType = !is.na(Type2),
classification = if_else(isMega == TRUE, "Mega",
if_else(Legendary == TRUE, "Legendary", "Normal"))
)
factor_cols = c("Generation", "classification")
int_cols = c("PokedexNum", "Total", "HP", "Attack", "Defense", "SpAtk", "SpDef", "Speed")
pokemon[factor_cols] <- lapply(pokemon[factor_cols], factor)
pokemon[int_cols] <- lapply(pokemon[int_cols], as.integer)
glimpse(pokemon)
totals <- pokemon %>%
group_by(Type1) %>%
summarise(count = n())
# Generation 1 Color Scheme
pokemon %>%
ggplot(aes(x = fct_infreq(Type1))) +
geom_bar(fill = "#84ADD7", color = "#F2684A") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
labs(title = "Frequency of Primary Types", x = "Type", y = "Frequency") +
geom_text(aes(Type1, count + 5, label = count, fill = NULL), data = totals)
totals <- pokemon %>%
filter(!is.na(Type2)) %>%
group_by(Type2) %>%
summarise(count = n())
# Generation 2 Color Scheme
pokemon %>%
filter(!is.na(Type2)) %>%
ggplot(aes(x = fct_infreq(Type2))) +
geom_bar(fill = "#C8CFD7", color = "#feff6a") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
labs(title = "Frequency of Secondary Types", x = "Type", y = "Frequency") +
geom_text(aes(Type2, count + 5, label = count, fill = NULL), data = totals)
type_combinations <- pokemon %>%
mutate(Type2 = ifelse(is.na(Type2), "", Type2)) %>%
group_by(Type1, Type2) %>%
summarise(count=n())
#Pikachu Color Scheme
type_combinations %>%
ggplot(aes(x=Type1,y=as.character(Type2))) +
geom_tile(aes(fill = count), show.legend = FALSE) +
geom_text(aes(label=count)) +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
labs(x="Type 1", y="Type 2",
title="Type Combinations") +
scale_fill_gradient(low="#f6bd20", high="#c52018")
# Bug, Dark, Dragon, Electric, Fairy, Fighting, Fire, Flying, Ghost, Grass, Ground, Ice, Normal, Poison, Psychic, Rock, Steel, Water
type_colors = c("#A8B820", "#705848", "#7038F8", "#F8D030", "#EE99AC", "#C03028","#F08030","#A890F0",
"#705898", "#78C850", "#E0C068", "#98D8D8","#A8A878", "#A040A0", "#F85888", "#B8A038",
"#B8B8D0", "#6890F0")
type_colors_outline = c("#C6D16E", "#49392F", "#4924A1", "#A1871F", "#9B6470", "#7D1F1A", "#9C531F",
"#6D5E9C", "#493963", "#4E8234", "#927D44", "#638D8D", "#6D6D4E", "#682A68",
"#A13959", "#786824", "#787887", "#445E9C")
pokemon %>%
ggplot(aes(x = Type1, y = Total, fill = Type1, color = Type1)) +
geom_boxplot(show.legend = FALSE) +
labs(title = "Stats by Primary Type") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none")
pokemon %>%
filter(Legendary == TRUE) %>%
ggplot(aes(x=Type1, fill = Type1, color = Type1)) +
geom_bar(show.legend = FALSE) +
scale_fill_manual(values = type_colors[-c(1,6, 14)],
guide = "none") +
scale_color_manual(values = type_colors_outline[-c(1,6, 14)],
guide = "none") +
labs(title = "Primary Type of Legendary Pokemon")
# Latios/Latias Color Scheme
pokemon %>%
ggplot(aes(fill = Legendary, x=Type1)) +
geom_bar(position="stack") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = c("#cd696e", "#7db5da")) +
labs(title = "Legendary Pokemon by Primary Type", x = "Primary Type", y = "Frequency")
# Xerneas/Yveltal Color Scheme
pokemon %>%
ggplot(aes(fill = isMega, x=Type1)) +
geom_bar(position="stack") +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = c("#e9351c", "#6275b9")) +
labs(title = "Mega Pokemon by Primary Type", x = "Primary Type", y = "Frequency")
is_outlier <- function(x) {
return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}
pokemon %>%
ggplot(aes(x = Type1, y = HP, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "HP by Primary Type")
pokemon %>%
filter(is_outlier(HP) == TRUE) %>%
mutate(HPPercent = round(HP / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, HP, HPPercent)
pokemon %>%
ggplot(aes(x = Type1, y = Attack, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Atk by Primary Type")
pokemon %>%
filter(is_outlier(Attack) == TRUE) %>%
mutate(AtkPercent = round(Attack / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, Attack, AtkPercent)
pokemon %>%
ggplot(aes(x = Type1, y = Defense, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "DEF by Primary Type")
pokemon %>%
filter(is_outlier(Defense) == TRUE) %>%
mutate(DefPercent = round(Defense / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, Defense, DefPercent)
pokemon %>%
ggplot(aes(x = Type1, y = SpAtk, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "SpAtk by Primary Type")
pokemon %>%
filter(is_outlier(SpAtk) == TRUE) %>%
mutate(SpAtkPercent = round(SpAtk / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, SpAtk, SpAtkPercent)
pokemon %>%
ggplot(aes(x = Type1, y = SpDef, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "SpDef by Primary Type")
pokemon %>%
filter(is_outlier(SpDef) == TRUE) %>%
mutate(SpDefPercent = round(SpDef / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, SpDef, SpDefPercent)
pokemon %>%
ggplot(aes(x = Type1, y = Speed, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Speed by Primary Type")
pokemon %>%
filter(is_outlier(Speed) == TRUE) %>%
mutate(SpeedPercent = round(Speed / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, Speed, SpeedPercent)
pokemon %>%
ggplot(aes(x = Type1, y = AtkTotal, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Total Atk by Primary Type")
pokemon %>%
filter(is_outlier(AtkTotal) == TRUE) %>%
mutate(AtkPercent = round(AtkTotal / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, AtkTotal, AtkPercent)
pokemon %>%
ggplot(aes(x = Type1, y = DefTotal, fill = Type1, color = Type1)) +
geom_boxplot() +
theme(axis.text.x=element_text(angle=45, hjust=1)) +
scale_fill_manual(values = type_colors,
guide = "none") +
scale_color_manual(values = type_colors_outline,
guide = "none") +
labs(title = "Total Def by Primary Type")
pokemon %>%
filter(is_outlier(DefTotal) == TRUE) %>%
mutate(DefPercent = round(DefTotal / Total, 2)) %>%
select(PokedexNum, Name, Type1, Type2, Total, DefTotal, DefPercent)
# Legendary Birds Color Scheme
pokemon %>%
ggplot(aes(x = classification, y = Total, color = classification, fill = classification)) +
geom_boxplot(show.legend = FALSE) +
scale_fill_manual(values = c("#d50808", "#ffd541", "#94c5ff"),
guide = "none") +
scale_color_manual(values = c("#ffc54a", "#9c7b10", "#005273"),
guide = "none") +
labs(title = "Total Stats by Classification")
pokemon %>%
ggplot(aes(x=Total)) +
geom_density(alpha=0.5, aes(fill=Type1)) +
facet_wrap(~Type1) +
labs(x="Total", y="Density") +
scale_fill_manual(values = type_colors,
guide = "none")
# Generation Mascot Color Scheme
pokemon %>%
ggplot(aes(x = Generation, y = Total, color = Generation, fill = Generation)) +
geom_boxplot() +
scale_fill_manual(values = c("#2062ac", "#deac00", "#ff2029", "#205a94", "#181820", "#6275b9"),
guide = "none") +
scale_color_manual(values = c("#F2684A", "#9cace6", "#313973", "#bd6ad5", "#bdbdd5", "#e9351c"),
guide = "none") +
labs(title = "Total Stats by Generation")
pokemon %>%
mutate(MaxAtk = ifelse(Attack > SpAtk, Attack, SpAtk)) %>%
filter(MaxAtk > 100) %>%
ggplot(aes(x = Speed, y = MaxAtk)) +
geom_point(aes(color = Type1)) +
geom_smooth(method = 'lm') +
scale_color_manual(values = type_colors) +
labs(title = "Offensive Potential (Speed vs. MaxAtk)")
pokemon %>%
mutate(MaxAtk = ifelse(Attack > SpAtk, Attack, SpAtk)) %>%
filter(MaxAtk >= 160 & Speed > 120) %>%
select(PokedexNum, Name, Type1, Type2, Total, Attack, SpAtk, MaxAtk,Speed)
pokemon %>%
mutate(MaxAtk = ifelse(Attack > SpAtk, Attack, SpAtk)) %>%
filter(MaxAtk >= 100 & Speed < 40) %>%
select(PokedexNum, Name, Type1, Type2, Total, Attack, SpAtk, MaxAtk,Speed)
pokemon %>%
mutate(MaxDef = ifelse(Defense > SpDef, Defense, SpDef)) %>%
ggplot(aes(x = HP, y = MaxDef)) +
geom_point(aes(color = Type1)) +
geom_smooth(method = 'lm') +
scale_color_manual(values = type_colors) +
labs(title = "Wall Potential (HP vs. MaxDef)")
pokemon %>%
mutate(MaxDef = ifelse(Defense > SpDef, Defense, SpDef)) %>%
filter(HP >= 150) %>%
arrange(-HP) %>%
select(PokedexNum, Name, Type1, Type2, Total, HP, Defense, SpDef, MaxDef)
pokemon %>%
mutate(MaxDef = ifelse(Defense > SpDef, Defense, SpDef)) %>%
filter(MaxDef >= 150) %>%
arrange(-MaxDef) %>%
select(PokedexNum, Name, Type1, Type2, Total, HP, Defense, SpDef, MaxDef)
# Scary Max
pokemon %>%
summarise(n = max(PokedexNum + 1),
name = "OP",
Total = max(HP)+max(Attack)+max(SpAtk)+max(Defense)+max(SpDef) + max(Speed),
HP = max(HP),
Attack = max(Attack),
SpAtk = max(SpAtk),
Defense = max(Defense),
SpDef = max(SpDef),
Speed = max(Speed))
pokemon %>%
filter(Total == max(Total))
pokemon %>%
summarise(n = min(PokedexNum - 1),
name = "OP",
Total = min(HP)+min(Attack)+min(SpAtk)+min(Defense)+min(SpDef) + min(Speed),
HP = min(HP),
Attack = min(Attack),
SpAtk = min(SpAtk),
Defense = min(Defense),
SpDef = min(SpDef),
Speed = min(Speed))
pokemon %>%
filter(Total == min(Total))
pokemon %>%
summarise(n = as.integer(mean(PokedexNum - 1)),
name = "OP",
Total = as.integer(mean(HP))+as.integer(mean(Attack))+as.integer(mean(SpAtk))+as.integer(mean(Defense))+as.integer(mean(SpDef)) + as.integer(mean(Speed)),
HP = as.integer(mean(HP)),
Attack = as.integer(mean(Attack)),
SpAtk = as.integer(mean(SpAtk)),
Defense = as.integer(mean(Defense)),
SpDef = as.integer(mean(SpDef)),
Speed = as.integer(mean(Speed)))
pokemon %>%
filter(Total == 432)
pokemon %>%
summarise(n = as.integer(median(PokedexNum - 1)),
name = "OP",
Total = as.integer(median(HP))+as.integer(median(Attack))+as.integer(median(SpAtk))+as.integer(median(Defense))+as.integer(median(SpDef)) + as.integer(median(Speed)),
HP = as.integer(median(HP)),
Attack = as.integer(median(Attack)),
SpAtk = as.integer(median(SpAtk)),
Defense = as.integer(median(Defense)),
SpDef = as.integer(median(SpDef)),
Speed = as.integer(median(Speed)))
pokemon %>%
filter(Total == 410)
# Chansey Color Scheme
pokemon %>%
ggplot(aes(x=HP)) +
geom_histogram(binwidth=4, fill="#ffacac", colour="#ff835a") +
labs(x="HP", y="Frequency")
# Landorus Color Scheme
pokemon %>%
ggplot(aes(x=Attack)) +
geom_histogram(binwidth=4, fill="#f67b41", colour="#83624a") +
labs(x="Attack", y="Frequency")
# Greninja
pokemon %>%
ggplot(aes(x=SpAtk)) +
geom_histogram(binwidth=4, fill="#354698", colour="#e7788d") +
labs(x="SpAtk", y="Frequency")
# Steelix
pokemon %>%
ggplot(aes(x=Defense)) +
geom_histogram(binwidth=4, fill="#7b94a4", colour="#dee6de") +
labs(x="Defense", y="Frequency")
# Shuckle
pokemon %>%
ggplot(aes(x=SpDef)) +
geom_histogram(binwidth=4, fill="#b43129", colour="#ffff5a") +
labs(x="SpDef", y="Frequency")
#Deoxys
pokemon %>%
ggplot(aes(x=Speed)) +
geom_histogram(binwidth=4, fill="#5294ac", colour="#ff734a") +
labs(x="Speed", y="Frequency")
#Mewtwo
pokemon %>%
ggplot(aes(x=Total)) +
geom_histogram(binwidth=10, fill="#6a319c", colour="#b4acc5") +
labs(x="Total", y="Frequency")
pokemon %>%
ggplot(aes(x=HP, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="HP", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Attack, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Attack", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=SpAtk, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="SpAtk", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Defense, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Defense", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=SpDef, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="SpDef", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Speed, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Speed", y="Density", title = "Legendary Comparison")
pokemon %>%
ggplot(aes(x=Total, fill=Legendary)) +
geom_density(alpha=0.5) +
labs(x="Total", y="Density", title = "Legendary Comparison")
pokemon %>%
group_by(Generation) %>%
summarise(avg = as.integer(mean(Total))) %>%
ggplot(aes(x=Generation, y = avg, group = 1)) +
geom_line() +
geom_point(color = "red") +
labs(title = "Average Total for each Generation")
pokemon %>%
group_by(Generation) %>%
summarize(HP=mean(HP),
Attack=mean(Attack),
Defense=mean(Defense),
Sp..Atk=mean(SpAtk),
Sp..Def=mean(SpDef),
Speed=mean(Speed)) %>%
gather(Stats, value, 2:7) %>%
ggplot(aes(x=Generation, y=value, group=1)) +
geom_line() +
geom_point(color = "red") +
facet_wrap(~Stats) +
labs(y="Average Stats")
#https://drmowinckels.io/blog/adding-external-images-to-plots/
#download.file("https://cdn.bulbagarden.net/upload/5/56/242Blissey.png", "blissey.png")
# download.file("http://cdn.bulbagarden.net/upload/f/f8/242MS.png", "blissey-h-sprite.png")
# download.file("https://cdn.bulbagarden.net/upload/e/ea/113MS.png", "chansey-h-sprite.png")
# download.file("https://cdn.bulbagarden.net/upload/f/fa/202MS.png", "wobb-h-sprite.png")
# download.file("https://cdn.bulbagarden.net/upload/e/ec/321MS.png", "wailord-h-sprite.png")
# download.file("https://cdn.bulbagarden.net/upload/5/5a/594MS.png", "alo-h-sprite.png")
# download.file("https://cdn.bulbagarden.net/upload/e/e0/143XYMS.png", "snorlax-h-sprite.png")
# download.file("https://cdn.bulbagarden.net/upload/0/0d/289MS.png", "slaking-h-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/e8/487MS.png", "g-alt-h-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/2f/487OMS.png", "g-o-h-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/4/45/426MS.png", "drif-h-sprite.png")
hp_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",
pattern = "-h-"),
pokemon %>%
top_n(10, HP) %>%
arrange(Name) %>%
mutate(rank = row_number(HP)) %>%
select(Name, rank)) %>%
arrange(rank)
hp_sprites
img = readPNG("images/blissey.png")
g = rasterGrob(img, interpolate=TRUE)
g_sprite = list()
hp_plot <- pokemon %>%
select(Name, HP) %>%
top_n(10) %>%
ggplot(aes(x=reorder(Name, HP), y=HP)) +
geom_bar(aes(fill=HP), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=HP)) +
scale_fill_gradient(low="#ff835a", high="#ffacac") +
coord_flip() +
labs(x="Name", title="Top 10 HP Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=7, ymin=160, ymax=260)
for(i in 1:nrow(hp_sprites)){
img = readPNG(paste0("images/",hp_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
hp_plot = hp_plot +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-15, ymax=2.5)
}
hp_plot
# download.file("https://archives.bulbagarden.net/media/upload/6/67/150MXMS.png", "mewtwo-a-x.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/72/214MMS.png", "hera-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ad/384MMS.png", "ray-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/98/383PMS.png", "groudon-a-p.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/07/386AMS.png", "deoxys-a-aform.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c0/646BMS.png", "kyurem-a-b.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/7f/445MMS.png", "garch-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/91/409MS.png", "ramp-a-sprite.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0a/475MMS.png", "gallade-a-m.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/5e/354MMS.png", "banette-a-m.png")
atk_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",
pattern = "-a-"), pokemon %>%
top_n(10, Attack) %>%
arrange(Name) %>%
mutate(rank = row_number(Attack)) %>%
select(Name, rank)) %>%
arrange(rank)
atk_sprites
#download.file("https://cdn.bulbagarden.net/upload/7/7f/150Mewtwo-Mega_X.png", "mega-X.png")
img = readPNG("images/mega-X.png")
g = rasterGrob(img, interpolate=TRUE)
g_sprite = list()
atk_graph <- pokemon %>%
select(Name, Attack) %>%
top_n(10, Attack) %>%
ggplot(aes(x=reorder(Name, Attack), y=Attack)) +
geom_bar(aes(fill=Attack), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Attack)) +
scale_fill_gradient(low="#b4acc5", high="#6a319c") +
coord_flip() +
labs(x="Name", title="Top 10 Attack Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=155, ymax=210)
for(i in seq_len(nrow(atk_sprites))){
img = readPNG(paste0("images/",atk_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
atk_graph = atk_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
atk_graph
# download.file("https://archives.bulbagarden.net/media/upload/2/29/181MMS.png", "ampharos-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/3/34/282MMS.png", "gardevoir-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/f4/094MMS.png", "gengar-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/64/720UMS.png", "hoopa-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/74/646WMS.png", "kyurem-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0c/065MMS.png", "alakazam-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/07/386AMS.png", "deoxys-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/02/382PMS.png", "kyogre-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ad/384MMS.png", "ray-spa-.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/ff/150MYMS.png", "mewtwo-spa-.png")
spatk_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",pattern = "-spa-"), pokemon %>%
top_n(10, SpAtk) %>%
arrange(Name) %>%
mutate(rank = row_number(SpAtk)) %>%
select(Name, rank)) %>%
arrange(rank)
spatk_sprites
#download.file("https://cdn.bulbagarden.net/upload/5/5f/150Mewtwo-Mega_Y.png", "mega-Y.png")
img = readPNG("images/mega-Y.png")
g = rasterGrob(img, interpolate=TRUE)
spatk_graph <- pokemon %>%
select(Name, SpAtk) %>%
top_n(10, SpAtk) %>%
ggplot(aes(x=reorder(Name, SpAtk), y=SpAtk)) +
geom_bar(aes(fill=SpAtk), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=SpAtk)) +
scale_fill_gradient(low="#b4acc5", high="#6a319c") +
coord_flip() +
labs(x="Name", y="Special Attack", title="Top 10 Special Attack Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=4, ymin=160, ymax=210)
for(i in seq_len(nrow(spatk_sprites))){
img = readPNG(paste0("images/",spatk_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
spatk_graph = spatk_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
spatk_graph
# download.file("https://archives.bulbagarden.net/media/upload/5/5b/306MMS.png", "aggron-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/d/d5/306MS.png", "aggron-d-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/53/713MS.png", "avalugg-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/ef/411MS.png", "bastiodon-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ac/091XYMS.png", "cloyster-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/6c/377MS.png", "regirock-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0a/213MS.png", "shuckle-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/65/080MMS.png", "slobrow-d.png")
# download.file("https://archives.bulbagarden.net/media/upload/b/bf/208MS.png", "steelix-d-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/05/208MMS.png", "steelix-d.png")
def_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/",
pattern = "-d"), pokemon %>%
top_n(10, Defense) %>%
arrange(Name) %>%
mutate(rank = row_number(Defense)) %>%
select(Name, rank)) %>%
arrange(rank)
def_sprites
#download.file("https://cdn.bulbagarden.net/upload/1/1b/208Steelix-Mega.png", "mega-steel.png")
img = readPNG("images/mega-steel.png")
g = rasterGrob(img, interpolate=TRUE)
def_graph <- pokemon %>%
select(Name, Defense) %>%
top_n(10, Defense) %>%
ggplot(aes(x=reorder(Name, Defense), y=Defense)) +
geom_bar(aes(fill=Defense), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Defense)) +
scale_fill_gradient(low="#dee6de", high="#7b94a4") +
coord_flip() +
labs(x="Name", title="Top 10 Defense Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=180, ymax=245)
for(i in seq_len(nrow(def_sprites))){
img = readPNG(paste0("images/",def_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
def_graph = def_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
def_graph
# download.file("https://archives.bulbagarden.net/media/upload/3/37/681MS.png", "aegislash-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/5b/703MS.png", "carbink-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/20/386DMS.png", "deoxys-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/53/719MS.png", "diancie-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/ea/671MS.png", "florges-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/3/3c/706MS.png", "goodra-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/e/ee/250MS.png", "Hooh-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/02/382PMS.png", "kyogre-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/24/380MMS.png", "latias-spd-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c8/249MS.png", "lugia-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/aa/476MS.png", "probopass-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/99/378MS.png", "regice-spd.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/2e/379MS.png", "registeel-spd-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0a/213MS.png", "shuckle-spd.png")
spdef_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/", pattern = "-spd"), pokemon %>%
top_n(10, SpDef) %>%
arrange(Name) %>%
mutate(rank = row_number(SpDef)) %>%
select(Name, rank)) %>%
arrange(rank)
spdef_sprites
#download.file("https://cdn.bulbagarden.net/upload/c/c7/213Shuckle.png", "shuckle.png")
img = readPNG("images/shuckle.png")
g = rasterGrob(img, interpolate=TRUE)
spdef_graph <- pokemon %>%
select(Name, SpDef) %>%
top_n(10, SpDef) %>%
ggplot(aes(x=reorder(Name, SpDef), y=SpDef)) +
geom_bar(aes(fill=SpDef), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=SpDef)) +
scale_fill_gradient(low="#ffff5a", high="#b43129") +
coord_flip() +
labs(x="Name", y="Special Defense", title="Top 10 Special Defense Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=180, ymax=230)
for(i in seq_len(nrow(spdef_sprites))){
img = readPNG(paste0("images/", spdef_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
spdef_graph = spdef_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
spdef_graph
# download.file("https://archives.bulbagarden.net/media/upload/0/07/386AMS.png", "deoxys-spe-a.png")
# download.file("https://archives.bulbagarden.net/media/upload/8/86/386MS.png", "deoxys-spe-n.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/fa/386SMS.png", "deoxys-spe-s.png")
# download.file("https://archives.bulbagarden.net/media/upload/3/33/617MS.png", "accelgor-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c4/142MMS.png", "aerodac-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/0c/065MMS.png", "alakazam-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c8/015MMS.png", "buzz-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/69/101MS.png", "electrocude-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/ff/150MYMS.png", "mewtwo-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/54/291MS.png", "ninjask-spe.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/27/254MMS.png", "sceptile-spe.png")
speed_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/", pattern = "-spe"), pokemon %>%
top_n(10, Speed) %>%
arrange(Name) %>%
mutate(rank = row_number(Speed)) %>%
select(Name, rank)) %>%
arrange(rank)
speed_sprites
#download.file("https://cdn.bulbagarden.net/upload/2/2b/386Deoxys-Speed.png", "speed.png")
img = readPNG("images/speed.png")
g = rasterGrob(img, interpolate=TRUE)
speed_graph <- pokemon %>%
select(Name, Speed) %>%
top_n(10, Speed) %>%
ggplot(aes(x=reorder(Name, Speed), y=Speed)) +
geom_bar(aes(fill=Speed), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Speed)) +
scale_fill_gradient(low="#5294ac", high="#ff734a") +
coord_flip() +
labs(x="Name", title="Top 10 Speed Pokémon") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=150, ymax=200)
for(i in seq_len(nrow(speed_sprites))){
img = readPNG(paste0("images/",speed_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
speed_graph = speed_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-10, ymax=2.5)
}
speed_graph
# download.file("https://archives.bulbagarden.net/media/upload/0/0e/493OD_DP.png", "arceus-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ae/719MMS.png", "diancie-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/7f/445MMS.png", "garchomp-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/9/98/383PMS.png", "groudon-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/0/02/382PMS.png", "kyogre-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/c/c0/646BMS.png", "kyurem-tol-b.png")
# download.file("https://archives.bulbagarden.net/media/upload/7/74/646WMS.png", "kyurem-tol-w.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/24/380MMS.png", "latias-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/2/2d/381MMS.png", "latios-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/54/376MMS.png", "metagross-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/6/67/150MXMS.png", "mewtwo-tol-x.png")
# download.file("https://archives.bulbagarden.net/media/upload/f/ff/150MYMS.png", "mewtwo-tol-y.png")
# download.file("https://archives.bulbagarden.net/media/upload/a/ad/384MMS.png", "ray-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/4/4a/373MMS.png", "salamence-tol.png")
# download.file("https://archives.bulbagarden.net/media/upload/5/59/248MMS.png", "ttar-tol.png")
total_sprites <- tibble(images=list.files(path = "~/Desktop/code/R/Pokemon/images/", pattern = "-tol"), pokemon %>%
top_n(10, Total) %>%
arrange(Name) %>%
mutate(rank = row_number(Total)) %>%
select(Name, rank)) %>%
arrange(rank)
total_sprites
#download.file("https://cdn.bulbagarden.net/upload/5/58/384Rayquaza-Mega.png", "mega-raq.png")
img = readPNG("images/mega-raq.png")
g = rasterGrob(img, interpolate=TRUE)
total_graph <- pokemon %>%
select(Name, Total) %>%
top_n(10, Total) %>%
ggplot(aes(x=reorder(Name, Total), y=Total)) +
geom_bar(aes(fill=Total), stat="identity", colour="black", show.legend=FALSE) +
geom_label(aes(label=Total)) +
coord_flip() +
scale_fill_gradient(low="#f6de00", high="#5abd8b") +
labs(x="Name", title="Top 10 Pokémon by Total Stats") +
annotation_custom(grob=g, xmin=0, xmax=5, ymin=600, ymax=900)
for(i in seq_len(nrow(total_sprites))){
img = readPNG(paste0("images/",total_sprites$images[i]))
g_sprite[[i]] = rasterGrob(img, interpolate=TRUE)
total_graph = total_graph +
annotation_custom(grob=g_sprite[[i]], xmin=i-5, xmax=i+5, ymin=-40, ymax=10)
}
total_graph
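A quick aside on why some of the sprite lists above hold more than ten files (14 for Special Defense, 15 for Total): `top_n()` keeps every row that ties at the cutoff, so "top 10" can return more than 10 rows. Here is a minimal base R sketch of that ranking rule. The toy data and the helper name `keep_top` are made up for illustration and are not part of the analysis.

```r
# Toy data (invented for illustration): three entries tie for second place.
toy <- data.frame(
  Name  = c("A", "B", "C", "D", "E"),
  Speed = c(180, 150, 150, 150, 90),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the top_n() rule: keep rows whose minimum rank,
# counted from the largest value, is at most n. Tied values share a rank.
keep_top <- function(df, column, n) {
  ranks <- rank(-df[[column]], ties.method = "min")
  df[ranks <= n, , drop = FALSE]
}

top2 <- keep_top(toy, "Speed", 2)
nrow(top2)  # more than 2 rows, because all three 150s share rank 2
```

This is why the number of sprite files has to match the number of rows `top_n()` returns, not the number 10.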
pokemon %>%
ggplot(aes(x = PokedexNum, y = Total, color = classification)) +
geom_point()
pokemon %>%
count(Generation) %>%
ggplot(aes(x=Generation, y=n, fill = Generation, color = Generation)) +
geom_bar(stat="identity") +
geom_label(aes(label=n)) +
labs(x="Generation", y="Number of Pokémon",
title="Number of Pokémon per generation") +
scale_fill_manual(values = c("#2062ac", "#deac00", "#ff2029", "#cdb4d5", "#181820", "#6275b9"),
guide = "none") +
scale_color_manual(values = c("white", "white", "white", "white", "white", "white"),
guide = "none")
ggplot(pokemon, aes(x=Type1, fill=Generation)) +
geom_bar() +
labs(x="Primary type", y="Number of Pokémon",
title="Number of Pokémon of each primary type, colored by generation") +
scale_fill_manual(values = c("#2062ac", "#deac00", "#ff2029", "#cdb4d5", "#181820", "#6275b9"))
ggplot(pokemon, aes(x=Generation, fill=Type1)) +
geom_bar() +
labs(x="Generation", y="Number of Pokémon",
title="Number of Pokémon of each primary type per generation") +
scale_fill_manual(values = type_colors)
ggplot(pokemon, aes(x=Generation, fill=isMultiType)) +
geom_bar() +
labs(x="Generation", y="Number of Pokémon",
title="Number of single-type and multi-type Pokémon per generation")
ggplot(pokemon, aes(x=Generation, fill=isMega)) +
geom_bar() +
labs(x="Generation", y="Number of Pokémon",
title="Number of Mega and non-Mega Pokémon per generation")
library(class) #for KNN
library(caret) #for cross validation of KNN method
# Stratified 20% sample per class; ungroup() so later steps see a plain tibble
test <- pokemon %>% group_by(classification) %>% sample_frac(.2) %>% ungroup()
train_data <- setdiff(pokemon, test)
dim(test)
dim(train_data)
glimpse(test)
glimpse(train_data)
class.knn.20 = knn(
train = train_data[6:11], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = train_data$classification, # vector of class labels for training data
k = 20)
class.knn.20
tibble(test[c("Name", "classification")], class.knn.20)
class.knn.conf.20 = table(true = test$classification, predicted = class.knn.20)
class.knn.conf.20
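For anyone who has not used `knn()` before: it labels each test row by a majority vote among its k nearest training rows, measured by Euclidean distance. The sketch below shows that idea in base R for a single query point. The helper `knn_one`, the toy stats, and the class labels are all invented for illustration. It is not how the class package is implemented, and it breaks vote ties by taking the first class alphabetically, whereas `knn()` breaks them at random.

```r
# Hypothetical helper: classify one query point by majority vote
# among its k nearest training rows (Euclidean distance).
knn_one <- function(train, labels, query, k) {
  diffs <- sweep(as.matrix(train), 2, as.numeric(query))  # subtract query from each row
  dists <- sqrt(rowSums(diffs^2))
  nearest <- order(dists)[seq_len(k)]
  votes <- table(labels[nearest])
  names(votes)[which.max(votes)]  # first class wins a tie in this sketch
}

# Toy training set (invented numbers and placeholder class names).
train <- data.frame(Attack = c(50, 55, 60, 150, 160),
                    Speed  = c(45, 50, 40, 130, 140))
labels <- c("Regular", "Regular", "Regular", "Legendary", "Legendary")

pred <- knn_one(train, labels, query = c(145, 120), k = 3)
pred
```

Because the vote depends on raw distances, a larger k pulls predictions toward whichever class has the most rows, which is worth keeping in mind when the classes are unbalanced.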
class.knn.50 = knn(
train = train_data[6:11], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = train_data$classification, # vector of class labels for training data
k = 50)
class.knn.50
tibble(test[c("Name", "classification")], class.knn.50)
class.knn.conf.50 = table(true = test$classification, predicted = class.knn.50)
class.knn.conf.50
trControl <- trainControl(method = "cv",
number = 20)
fit <- train(classification ~ HP + Attack + Defense + SpAtk + SpDef + Speed,
method = "knn",
tuneGrid = expand.grid(k = 1:50),
trControl = trControl,
metric = "Accuracy",
data = train_data
)
fit
k = fit$results$k[which.max(fit$results$Accuracy)]
class.knn = knn(
train = train_data[6:11], # training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = train_data$classification, # vector of class labels for training data
k = k)
class.knn
error_knn <- tibble(test[c("Name", "classification")], class.knn) %>%
filter(classification != class.knn)
error_knn
class.knn.conf = table(true = test$classification, predicted = class.knn)
class.knn.conf
nrow(error_knn)
error_rate = nrow(error_knn) / nrow(test) # divide by the actual test-set size rather than a hard-coded 160
error_rate
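The same error rate can also be read straight off a confusion matrix: the diagonal holds the correct predictions, so the error rate is one minus the diagonal share. A small sketch with made-up labels follows. The class names are placeholders, and both factors are given the same levels so the table is square and its diagonal lines up.

```r
# Made-up labels for illustration; class names are placeholders.
lv <- c("Legendary", "Regular")
truth     <- factor(c("Regular", "Regular", "Legendary", "Regular", "Legendary"), levels = lv)
predicted <- factor(c("Regular", "Legendary", "Legendary", "Regular", "Regular"), levels = lv)

conf <- table(true = truth, predicted = predicted)

# Share of predictions that fall off the diagonal.
error_rate_from_conf <- 1 - sum(diag(conf)) / sum(conf)

# Equivalent direct calculation: fraction of mismatched labels.
error_rate_direct <- mean(truth != predicted)

c(from_conf = error_rate_from_conf, direct = error_rate_direct)
```

Either form avoids hard-coding the size of the test set.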
detach("package:class", unload = TRUE)
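# Despite its name, testing_df holds the training rows: the class label plus the six stats
testing_df <- train_data %>%
select(classification, HP, Attack, Defense, SpAtk, SpDef, Speed)
# upSample() expects a factor response, so coerce in case classification is stored as character
utrain <- upSample(testing_df[,-1], as.factor(testing_df$classification))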
table(utrain$Class)
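To make the up-sampling step less of a black box, here is a rough base R sketch of the idea: every class keeps its original rows, and smaller classes get randomly drawn duplicates until they match the largest class. The toy data, the seed, and the helper `upsample_rows` are assumptions for illustration. This is not caret's own code.

```r
set.seed(1)  # arbitrary seed so the toy example is repeatable

# Toy data (invented): five rows of one class, one row of the other.
toy <- data.frame(
  HP    = c(45, 60, 80, 39, 58, 106),
  Class = factor(c("Regular", "Regular", "Regular", "Regular", "Regular", "Legendary"))
)

target <- max(table(toy$Class))  # size of the largest class

# Hypothetical helper: keep the original row indices and append
# random duplicates (drawn with replacement) up to the target size.
upsample_rows <- function(idx, target) {
  extra <- target - length(idx)
  c(idx, idx[sample.int(length(idx), extra, replace = TRUE)])
}

idx_by_class <- split(seq_len(nrow(toy)), toy$Class)
balanced <- toy[unlist(lapply(idx_by_class, upsample_rows, target = target)), ]
table(balanced$Class)
```

Down-sampling is the mirror image: larger classes are randomly thinned to the size of the smallest one, which throws data away instead of duplicating it.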
fit <- train(Class ~ .,
method = "knn",
tuneGrid = expand.grid(k = 1:50),
trControl = trControl,
metric = "Accuracy",
data = utrain
)
fit
k = fit$results$k[which.max(fit$results$Accuracy)]
library(class) #for KNN
class.knn = knn(
train = utrain[,-7], # up-sampled training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = utrain$Class, # vector of class labels for up-sampled training data
k = k)
error_knn <- tibble(test[c("Name", "classification")], class.knn) %>%
filter(classification != class.knn)
error_knn
class.knn.conf = table(true = test$classification, predicted = class.knn)
class.knn.conf
nrow(error_knn)
error_rate = nrow(error_knn) / nrow(test) # divide by the actual test-set size rather than a hard-coded 160
error_rate
detach("package:class", unload = TRUE)
# downSample() also expects a factor response
dtrain <- downSample(testing_df[,-1], as.factor(testing_df$classification))
table(dtrain$Class)
fit <- train(Class ~ .,
method = "knn",
tuneGrid = expand.grid(k = 1:50),
trControl = trControl,
metric = "Accuracy",
data = dtrain # tune k on the down-sampled data, not the up-sampled set
)
fit
k = fit$results$k[which.max(fit$results$Accuracy)]
library(class) #for KNN
library(caret)
class.knn = knn(
train = dtrain[,-7], # down-sampled training data for features used in classification
test = test[6:11], # test data for features used in classification
cl = dtrain$Class, # vector of class labels for down-sampled training data
k = k)
error_knn <- tibble(test[c("Name", "classification")], class.knn) %>%
filter(classification != class.knn)
error_knn
class.knn.conf = table(true = test$classification, predicted = class.knn)
class.knn.conf
nrow(error_knn)
error_rate = nrow(error_knn) / nrow(test) # divide by the actual test-set size rather than a hard-coded 160
error_rate
detach("package:class", unload = TRUE)